On Thu, Mar 19, 2020 at 4:33 PM Jon Maloy <[email protected]> wrote:
>
>
>
> On 3/19/20 7:03 PM, Tom Herbert wrote:
> > On Thu, Mar 19, 2020 at 3:43 PM Jon Maloy <[email protected]> wrote:
> >>
> >>
> >> On 3/18/20 12:04 AM, Joseph Touch wrote:
> >>
> >> Hi all,
> >>
> >> I’m quite confused by this request.
> >>
> >> It seems like they either have an implementation issue (in Linux).
> >>
> >> Linux "passthru" GSO is implemented so that any IP based protocol
> >> which wants to benefit from it needs its own IP protocol number.
> >> Doing this generically through the already existing UDP protocol
> >> number is not possible, because GSO on a host must be implemented
> >> specifically (e.g., regarding segmentation) per carried protocol.
> >> That is just a fact, and not an implementation issue.
> > Jon,
> >
> > I'm not sure I understand your point. Linux already supports GSO, and
> > GRO for that matter, for several protocols encapsulated over UDP. I
> > don't see any requirement for a protocol to need its own IP protocol
> > number in this regard.
> >
> > Tom
> Yes, but this is not about guest GSO. What we need is something more
> similar to TCP TSO, where we can send full-size buffers down to the
> host OS, and only do segmentation (or in our case, a TIPC specific
> fragmentation where each fragment gets an individually numbered header)
> when we find that the destination is off-host.
> Basically we want to transport full-size messages between VMs when those
> are located in the same host. So far, I haven't found any way to
> do this on the host by looking at the inner protocol carried over UDP.
> But I may of course be wrong at this point, I know you are the expert.
>
Jon,

You might want to look at Willem's work in UDP GSO
(http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-presentation-20181104.pdf).
That might be useful as a generic method assuming the proper APIs are
supported (this is exactly how QUIC GSO was solved without needing
explicit kernel support for QUIC).

Tom

> ///jon
>
> >
> >>
> >> I checked their documentation, which includes something that looks a
> >> little like an Internet Draft:
> >> http://tipc.io/protocol.html
> >> but it’s quite confusing. Taken at face value, they make their own
> >> argument that IP addresses won’t work - at which point running raw
> >> over IP serves no utility (sec 3.1.1),
> >>
> >> That is not a correct interpretation of the text. Nowhere is it
> >> stated that IP addresses won't work for TIPC, neither in sec. 3.1.1
> >> nor anywhere else. Of course they work, *for transport purposes*,
> >> just as they have for many years already when running TIPC over UDP.
> >> What we state elsewhere in the document is that IP addresses are no
> >> good in the *user API*, because they are location bound.
> >> That is also why DNS was invented, I believe.
> >>
> >> We also state that using IP addresses is less optimal than omitting
> >> the IP layer altogether and using MAC addresses, but that doesn't
> >> mean the former are useless; it just makes IP the only viable
> >> alternative in the cases when a network owner doesn't allow non-IP
> >> protocols through their backplanes, or when routing gets involved.
> >>
> >> even though most of those claims are debatable (DNS-SD is too static?
> >> And expensive?? How so?). Then they reinvent the DNS in Section 6.
> >>
> >> There is no doubt that DNS is not the best choice for the type of
> >> environments (tight clusters) where we use TIPC.
> >> All DNS implementations I know run in user land, and doing a service
> >> discovery typically means at least one, and often several,
> >> inter-process and potentially inter-node hops. Even if there is a
> >> process-local lookup cache in each sender, that cache has to be
> >> populated before it is of any use.
> >> Instead, TIPC uses a tailor-made kernel resident translation service
> >> which normally contains a complete copy of the lookup database, so
> >> there are no unnecessary hops and no cache misses.
> >>
> >> This would have been of less importance if TIPC were only a
> >> connection oriented TCP-like service where service lookup is only
> >> needed at connection setup. But an equally important feature of TIPC
> >> is its reliable connectionless transport mode. Here, the lookup
> >> service is not primarily about service discovery (although that is
> >> also important), but about efficient on-the-fly translation between
> >> user level service addresses (aka "port names") and location bound
> >> socket addresses (aka "port identities"). This translation has to be
> >> performed per message, not per connection, since the destination may
> >> change from one message to the next.
> >>
> >> If we were to make an analogy with the IP world, we could imagine
> >> that we use UDP to send high volume traffic to many different
> >> destinations, each having its own domain name. Making a separate DNS
> >> lookup for each sent message would certainly work, but it would be
> >> nowhere near as performant as having a tailor-made "always cache
> >> resident" translation table, shared between all processes, like we
> >> do in TIPC.
> >>
> >> Furthermore, when the connectionless service is used, sockets might
> >> be created/deleted and bound/unbound at extremely high rates, much
> >> higher than DNS with its hierarchical updates is meant to deal with.
> >> This is what we mean by DNS being too "static".
> >> This is not to say that DNS is bad; it is just stating that it is
> >> not designed for the very high performance requirements and dynamism
> >> we have in TIPC.
> >>
> >> There is no doubt that a few things in TIPC could have been done
> >> differently, but the decision to design our own topology/lookup
> >> service is not among those. This request is an attempt to open the
> >> way to moving beyond some current limitations, e.g., by enabling
> >> introduction of a more versatile 128-bit service addressing concept.
> >> Along with this request we are aiming at having an updated version
> >> of the protocol description adopted as an informational RFC, so that
> >> TIPC can be regarded as an IETF supported protocol in its own right.
> >>
> >> Whatever the viewpoints, TIPC is currently what it is, and rather
> >> than focusing on the motivation for certain implementation choices
> >> and how they work, I think IETF should consider the fact that this
> >> is a well-established service used by dozens of small and big
> >> companies, running high-volume traffic at hundreds of telco sites
> >> around the globe. They should also consider that TIPC has existed as
> >> a stable and well-maintained implementation in all major Linux
> >> distros for many years.
> >>
> >> IETF now has a genuine chance to help us make TIPC even more useful
> >> for existing and new users.
> >>
> >> BR
> >> Jon Maloy
> >>
> >>
> >> Frankly, IMO this would probably have a difficult time arguing for a
> >> transport protocol port number, much less an IP protocol number.
> >>
> >> Joe
> >>
> >>
> >> On Mar 17, 2020, at 3:34 PM, Suresh Krishnan <[email protected]> wrote:
> >>
> >> Hi all,
> >> IANA received an IP protocol number allocation request from Jon Maloy
> >> <[email protected]> for the Transparent Inter Process Communication
> >> (TIPC) protocol. I picked up this request as Internet AD as the
> >> registration procedure requires IESG Approval.
> >> I had provided the information below to the IESG and discussed this
> >> with a favorable view of this request. I am recommending allocation
> >> of an IP protocol number for this. If you have any concerns that you
> >> think I might have overlooked, please let me know by end of day
> >> March 24 2020.
> >>
> >> After several round trips of back and forth probing I had collected
> >> the following information regarding the protocol number request for
> >> TIPC. There were two main questions I had for him:
> >>
> >> * Q1: Why did they want an IP protocol number?
> >> * Q2: Is the protocol implemented and deployed widely?
> >>
> >> Q1: Why did they want an IP protocol number?
> >> ====================================
> >>
> >> There are two main reasons why they want to reserve an IP protocol
> >> number:
> >>
> >> 1) Performance
> >> They are currently working on adding GSO support to TIPC, including a
> >> TSO-like "full-size buffer pass-thru" through virtio and the host OS
> >> tap interface. They have experimentally implemented GSO across UDP
> >> tunnels, but performance is not good because of the way the tunnel
> >> GSO is implemented, and there is no 'pass-thru' support for this in
> >> Linux. They have even done the same at the pure L2 level, but L2
> >> transport is sometimes not accepted by the cloud maintainers or the
> >> telco operators, and hence they need an alternative. The best
> >> alternative, from both a performance and an acceptability viewpoint,
> >> would be to establish TIPC as a full-fledged IP protocol, alongside
> >> the traditional L2 bearer many users are still using.
> >>
> >> 2) Currently TIPC has two user address types:
> >>
> >> struct tipc_service_addr{
> >>         uint32_t type;
> >>         uint32_t instance;
> >>         uint32_t node;
> >> };
> >> struct tipc_socket_addr{
> >>         uint32_t port;
> >>         uint32_t node;
> >> };
> >>
> >> They want to complement this with a new API where we have a unified
> >> address type:
> >> struct tipc_addr{
> >>         u8 type[16];
> >>         u8 instance[16];
> >>         u8 node[16];
> >> };
> >>
> >> This would give a 128-bit value range for each of 'type', 'instance'
> >> and 'node', and opens up new opportunities:
> >> - Users will never need to coordinate 'type' values since there will
> >> be no risk of collisions.
> >> - Users can put whatever they want into the fields, e.g., an IPv6
> >> address, a Kubernetes or Docker container id, a LUKS disk UUID or
> >> just a plain string.
> >> For the 'node' id this has already been implemented and released,
> >> but it is not reflected in the API yet.
> >>
> >> For the API extension they need a new IPPROTO_TIPC socket type which
> >> can be registered and instantiated independently from the
> >> traditional AF_TIPC socket type.
> >>
> >> You can find more info about this at http://tipc.io
> >>
> >> Q2: Is the protocol implemented and deployed widely?
> >> ==========================================
> >>
> >> The requester provided the following information when I asked about
> >> who was currently using TIPC (pretty much about adoption and
> >> deployment):
> >>
> >> I can give you a list of current or recently active code contributors
> >> and companies/people who have been asking for support:
> >>
> >> Huawei:
> >> For natural reasons I don't know any details about them, I can only
> >> name persons I have seen contributing to netdev or being active on
> >> our mailing lists. Huawei people sometimes use gmail addresses when
> >> posting questions and patches, so there are more persons than I have
> >> listed here.
> >> Dmitry Kolmakov <[email protected]>
> >> Ji Qin <[email protected]>
> >> Wei Yongjun <[email protected]>
> >> <[email protected]>
> >> Yue Haibing <[email protected]>
> >> Junwei Hu <[email protected]>
> >> Jie Liu <[email protected]>
> >> Qiang Ning <[email protected]>
> >> Zhiqiang Liu <[email protected]>
> >> Miaohe Lin <[email protected]>
> >> Wang Wang <[email protected]>
> >> Kang Zhou <[email protected]>
> >> Suanming Mou <[email protected]>
> >>
> >> Hu Junwei is the one I see most active at the moment.
> >>
> >> Nokia:
> >> Tommi Rantala <[email protected]>
> >>
> >> Verizon:
> >> Amar Nv <[email protected]>
> >> Jayaraj Wilson, <[email protected]>
> >>
> >> Hewlett Packard Enterprise:
> >> <[email protected]>
> >>
> >> WindRiver:
> >> Ying Xue <[email protected]>
> >> He is my co-maintainer at netdev and SourceForge.
> >> Windriver has several products in the field based on TIPC, e.g.
> >> control system for Sikorsky helicopters.
> >>
> >> Orange:
> >> Christophe JAILLET <[email protected]>
> >>
> >> Redhat:
> >> The person contacting me to have TIPC integrated and maintained in
> >> RHEL-8.0 was
> >> Sirius Rayner-Karlsson <[email protected]>
> >> He motivated it with a request from "a telco vendor", but I don't
> >> know which one.
> >> Hence, TIPC is now integrated in and officially supported from
> >> RHEL 8.1
> >>
> >> ABB:
> >> https://new.abb.com/pl
> >> Mikolaj K. Chojnacki <[email protected]>
> >> Krzysztof Rybak <[email protected]>
> >>
> >> Ericsson:
> >> All (dozens of) applications based on the TSP and Core
> >> Middleware/Components Based Architecture (CMW/CBA) platforms are by
> >> definition based on TIPC. They have not yet started to use TIPC on
> >> their Kubernetes based ADP platform, but there is work ongoing on
> >> this.
> >>
> >> I also see numerous other people being active, from small (I
> >> believe) companies, universities and private contributors.
> >> E.g.,
> >> Innovsys Inc http://www.innovsys.com/innovsys/
> >> Allied Telesis https://www.alliedtelesis.com/
> >> Telaverge Communications http://www.telaverge.com/
> >> Ivan Serdyuk <[email protected]> (seems to be responsible for
> >> the ZeroMQ port of TIPC)
> >> Johns Hopkins University / Fast LTA, Munich <[email protected]>
> >> Just to mention a few...
> >>
> >> TIPC is currently maintained jointly by Ericsson, WindRiver, Redhat,
> >> and the Australian consulting company DEK Technologies
> >> https://www.dektech.com.au/
> >>
> >> Thanks
> >> Suresh
> >>
> >> _______________________________________________
> >> Int-area mailing list
> >> [email protected]
> >> https://www.ietf.org/mailman/listinfo/int-area
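
For reference, a minimal sketch of the UDP GSO socket option mentioned
at the top of this message, assuming Linux 4.18 or later. The helper
names udp_gso_enable() and udp_gso_datagrams() are hypothetical and
exist only for illustration; setsockopt(), SOL_UDP and UDP_SEGMENT are
the actual kernel interface. It only shows how one large send is split
into equal-sized UDP payloads; whether that can carry TIPC's
individually numbered fragment headers is not something this sketch
settles.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Fallback definitions for older libc headers; the values match
 * <netinet/udp.h> and <linux/udp.h>. */
#ifndef SOL_UDP
#define SOL_UDP 17
#endif
#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103
#endif

/* Hypothetical helper: ask the kernel to split each large send on this
 * UDP socket into datagrams carrying at most gso_size payload bytes.
 * Returns 0 on success, -1 with errno set otherwise (for example when
 * the running kernel does not know UDP_SEGMENT). */
int udp_gso_enable(int fd, int gso_size)
{
        return setsockopt(fd, SOL_UDP, UDP_SEGMENT,
                          &gso_size, sizeof(gso_size));
}

/* Hypothetical helper: number of datagrams that one send of len bytes
 * (len > 0) turns into; every datagram except possibly the last
 * carries exactly gso_size bytes. Returns 0 if gso_size is 0. */
size_t udp_gso_datagrams(size_t len, size_t gso_size)
{
        if (gso_size == 0)
                return 0;
        return (len + gso_size - 1) / gso_size;
}
```

With udp_gso_enable(fd, 1000) in effect, a single 3500-byte send() on a
connected socket would leave as four datagrams, the last one carrying
500 bytes.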
