On 3/19/20 8:07 PM, Tom Herbert wrote:
On Thu, Mar 19, 2020 at 4:33 PM Jon Maloy <jma...@redhat.com> wrote:


On 3/19/20 7:03 PM, Tom Herbert wrote:
On Thu, Mar 19, 2020 at 3:43 PM Jon Maloy <jma...@redhat.com> wrote:

On 3/18/20 12:04 AM, Joseph Touch wrote:

Hi all,

I’m quite confused by this request.

It seems like they either have an implementation issue (in Linux).

Linux "passthru" GSO is implemented so that any IP-based protocol which
wants to benefit from it needs its own IP protocol number. Doing this
generically through the already existing UDP protocol number is not
possible, because GSO on a host must be implemented specifically (e.g.,
regarding segmentation) per carried protocol. That is just a fact, and
not an implementation issue.
Jon,

I'm not sure I understand your point. Linux already supports GSO, and
GRO for that matter, for several protocols encapsulated over UDP. I
don't see any requirement for a protocol to need its own IP protocol
number in this regard.

Tom
Yes, but this is not about guest GSO. What we need is something more
similar to TCP TSO, where we can send full-size buffers down to the
host OS, and only do segmentation (or in our case, a TIPC specific
fragmentation where each fragment gets an individually numbered header)
when we find that the destination is off-host.
Basically we want to transport full-size messages between VMs when those
are located on the same host. So far, I haven't found any way to
do this on the host by looking at the inner protocol carried over UDP.
But I may of course be wrong at this point; I know you are the expert.

Jon,

You might want to look at Willem's work in UDP GSO
(http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-presentation-20181104.pdf).
That might be useful as a generic method assuming the proper APIs are
supported (this is exactly how QUIC GSO was solved without needing
explicit kernel support for QUIC).

Tom
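
For readers unfamiliar with the UDP GSO interface referenced above, the
following is a minimal sketch, assuming a Linux kernel with UDP_SEGMENT
support (4.18 or later). The function names enable_udp_gso and
udp_gso_segment_count are hypothetical helpers introduced here for
illustration; they are not part of any kernel or TIPC API. The sketch
only shows generic UDP segmentation into equal-sized datagrams. It does
not implement the TIPC-specific fragmentation with individually numbered
headers described earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* UDP_SEGMENT is defined in <linux/udp.h>; fall back to its numeric
 * value so this sketch also compiles where the macro is not visible. */
#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103
#endif

/* Hypothetical helper: ask the kernel to split every large send on this
 * UDP socket into datagrams carrying at most gso_size payload bytes.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_udp_gso(int fd, int gso_size)
{
    assert(gso_size > 0);
    return setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
                      &gso_size, sizeof(gso_size));
}

/* Hypothetical helper: number of datagrams that one send of len bytes
 * is split into when the segment size is gso_size (ceiling division). */
size_t udp_gso_segment_count(size_t len, size_t gso_size)
{
    assert(gso_size > 0);
    return (len + gso_size - 1) / gso_size;
}
```

A sender would call enable_udp_gso(fd, 1400) once and then hand large
buffers to send(), subject to the kernel's limits on total size and
segment count; the split into wire-sized datagrams happens below the
socket layer.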
Hi Tom,
I'll take a look at this. Thank you for the tip.
///jon

///jon

I checked their documentation, which includes something that looks a
little like an Internet Draft:
http://tipc.io/protocol.html
but it’s quite confusing. Taken at face value, they make their own
argument that IP addresses won’t work - at which point running raw over
IP serves no utility (sec 3.1.1),

That is not a correct interpretation of the text. Nowhere is it stated
that IP addresses won't work for TIPC, neither in sec. 3.1.1 nor
anywhere else. Of course they work, *for transport purposes*, just as
they have for many years already when running TIPC over UDP. What we
state elsewhere in the document is that IP addresses are no good in the
*user API*, because they are location bound. That is also why DNS was
invented, I believe.

We also state that using IP addresses is less optimal than omitting the
IP layer altogether and using MAC addresses, but that doesn't mean the
former are useless. It just makes IP the only viable alternative in
cases where a network owner doesn't allow non-IP protocols through their
backplanes, or when routing gets involved.

even though most of those claims are debatable (DNS-SD is too static? And 
expensive?? How so?). Then they reinvent the DNS in Section 6.

There is no doubt that DNS is not the best choice for the type of
environments (tight clusters) where we use TIPC. All DNS implementations
I know run in user land, and doing a service discovery typically means
at least one, and often several, inter-process and potentially
inter-node hops. Even if there is a process-local lookup cache in each
sender, that cache has to be populated before it is of any use. Instead,
TIPC uses a tailor-made kernel-resident translation service which
normally contains a complete copy of the lookup database, so there are
no unnecessary hops and no cache misses.

This would have been of less importance if TIPC were only a
connection-oriented TCP-like service where service lookup is only needed
at connection setup. But an equally important feature of TIPC is its
reliable connectionless transport mode. Here, the lookup service is not
primarily about service discovery (although that is also important), but
about efficient on-the-fly translation between user-level service
addresses (aka "port names") and location-bound socket addresses (aka
"port identities"). This translation has to be performed per message,
not per connection, since the destination may change from one message to
the next.

If we were to make an analogy with the IP world, we could imagine that
we use UDP to send high-volume traffic to many different destinations,
each having its own domain name. Making a separate DNS lookup for each
sent message would certainly work, but it would not be nearly as
performant as having a tailor-made "always cache resident" translation
table, shared between all processes, like we do in TIPC.
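
To make the per-message translation concrete, here is a minimal sketch
of an in-memory table that maps a service address to a socket address.
It is an illustration under simplifying assumptions, not TIPC's actual
implementation: the names name_table, name_table_bind and
name_table_lookup are hypothetical, the table has a fixed size, each
service address maps to exactly one socket, and unbinding is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified address types, assumed for this sketch only. */
struct service_addr { uint32_t type; uint32_t instance; };
struct socket_addr  { uint32_t port; uint32_t node; };

struct binding {
    struct service_addr svc;
    struct socket_addr  sock;
    int in_use;
};

enum { TABLE_SIZE = 64 };

/* Hypothetical shared table: open addressing with linear probing. */
struct name_table {
    struct binding slots[TABLE_SIZE];
};

static size_t first_slot(struct service_addr svc)
{
    return (svc.type * 2654435761u + svc.instance) % TABLE_SIZE;
}

static int same_service(struct service_addr a, struct service_addr b)
{
    return a.type == b.type && a.instance == b.instance;
}

/* Bind (or rebind) a service address. Returns 0, or -1 if the table is full. */
int name_table_bind(struct name_table *t, struct service_addr svc,
                    struct socket_addr sock)
{
    assert(t != NULL);
    size_t start = first_slot(svc);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        struct binding *b = &t->slots[(start + i) % TABLE_SIZE];
        if (!b->in_use || same_service(b->svc, svc)) {
            b->svc = svc;
            b->sock = sock;
            b->in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Per-message lookup. Returns 0 and fills *out, or -1 if not bound. */
int name_table_lookup(const struct name_table *t, struct service_addr svc,
                      struct socket_addr *out)
{
    assert(t != NULL && out != NULL);
    size_t start = first_slot(svc);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        const struct binding *b = &t->slots[(start + i) % TABLE_SIZE];
        if (!b->in_use)
            return -1;          /* empty slot ends the probe sequence */
        if (same_service(b->svc, svc)) {
            *out = b->sock;
            return 0;
        }
    }
    return -1;
}
```

Because a rebind overwrites the entry in place, the next message sent to
the same service address resolves to the new socket without any
connection setup or cache refresh, which is the property the paragraph
above relies on.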

Furthermore, when the connectionless service is used, sockets might be
created/deleted and bound/unbound at extremely high rates, much higher
than DNS with its hierarchical updates is meant to deal with. This is
what we mean by DNS being too "static". It is not saying that DNS is
bad, it is just stating that it is not designed for the very high
performance requirements and dynamism we have in TIPC.

There is no doubt that a few things in TIPC could have been done
differently, but the decision to design our own topology/lookup service
is not among those. This request is an attempt to open up for moving
beyond some current limitations, e.g., by enabling introduction of a
more versatile 128-bit service addressing concept. Along with this
request we are aiming at having an updated version of the protocol
description adopted as an informational RFC, so that TIPC can be
regarded as an IETF supported protocol in its own right.

Whatever the viewpoints, TIPC is currently what it is, and rather than
focusing on the motivation for certain implementation choices and how
they work, I think IETF should consider the fact that this is a
well-established service used by dozens of small and big companies,
running high-volume traffic at hundreds of telco sites around the globe.
They should also consider that TIPC has existed as a stable and
well-maintained implementation in all major Linux distros for many
years.

IETF now has a genuine chance to help us make TIPC even more useful for
existing and new users.

BR
Jon Maloy


Frankly, IMO this would probably have a difficult time arguing for a transport 
protocol port number, much less an IP protocol number.

Joe


On Mar 17, 2020, at 3:34 PM, Suresh Krishnan <sur...@kaloom.com> wrote:

Hi all,
    IANA received an IP protocol number allocation request from Jon Maloy 
<jma...@redhat.com> for the Transparent Inter Process Communication (TIPC) 
protocol. I picked up this request as Internet AD as the registration procedure 
requires IESG Approval. I had provided the information below to the IESG and 
discussed this with a favorable view of this request. I am recommending allocation of 
an IP protocol number for this. If you have any concerns that you think I might have 
overlooked, please let me know by end of day March 24 2020.

After several round trips of back and forth probing I had collected the 
following information regarding the protocol number request for TIPC. There 
were two main questions I had for him:

* Q1: Why did they want an IP protocol number?
* Q2: Is the protocol implemented and deployed widely?

Q1: Why did they want an IP protocol number?
====================================

There are two main reasons why they want to reserve an IP protocol number:

1)  Performance
They are currently working on adding GSO support to TIPC, including a TSO-like 
"full-size buffer pass-thru" through virtio and the host OS tap interface. They 
have experimentally implemented GSO across UDP tunnels, but performance is not good 
because of the way the tunnel GSO is implemented, and there is no 'pass-thru' support for 
this in Linux. They have even done the same at the pure L2 level, but L2 transport is 
sometimes not accepted by the cloud maintainers or the telco operators, and hence they 
need an alternative. The best alternative, from both a performance and an acceptability 
viewpoint, would be to establish TIPC as a full-fledged IP protocol, apart from the 
traditional L2 bearer many users are still using.

2) Currently TIPC has two user address types:

/* Service address (aka "port name") */
struct tipc_service_addr{
      uint32_t type;
      uint32_t instance;
      uint32_t node;
};
/* Socket address (aka "port identity") */
struct tipc_socket_addr{
      uint32_t port;
      uint32_t node;
};

They want to complement this with a new API that has a unified address 
type:
struct tipc_addr{
     u8 type[16];
     u8 instance[16];
     u8 node[16];
};

This would give a 128-bit value range for each of 'type', 'instance' and 'node', 
and opens up new opportunities:
- Users will never need to coordinate 'type' values since there will be no risk of 
collisions.
- Users can put whatever they want into the fields, e.g., an IPv6 address, a 
Kubernetes or Docker container id, a LUKS disk UUID or just a plain string.
For the 'node' id this has already been implemented and released, but it is not 
reflected in the API yet.
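
As an illustration of how such 128-bit fields could be filled, here is a
minimal user-space sketch. It assumes the proposed layout shown above,
with uint8_t standing in for the kernel's u8. The helper
tipc_addr_set_field is hypothetical and is not part of any existing TIPC
API; zero-padding of short values is likewise an assumption made for
this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Proposed unified address type, as quoted in the request. */
struct tipc_addr {
    uint8_t type[16];
    uint8_t instance[16];
    uint8_t node[16];
};

/* Hypothetical helper: copy up to 16 bytes (a short string, a UUID, an
 * IPv6 address, ...) into one field and zero-pad the remainder.
 * Returns 0 on success, -1 if the value does not fit in 128 bits. */
int tipc_addr_set_field(uint8_t field[16], const void *src, size_t len)
{
    assert(field != NULL && (src != NULL || len == 0));
    if (len > 16)
        return -1;
    memset(field, 0, 16);
    if (len > 0)
        memcpy(field, src, len);
    return 0;
}
```

For example, tipc_addr_set_field(a.type, "billing", 7) stores a plain
string as the service type, while a 16-byte IPv6 address or UUID fills a
field completely.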

For the API extension they need a new IPPROTO_TIPC socket type which can be 
registered and instantiated independently from the traditional AF_TIPC socket 
type.

You can find more info about this at http://tipc.io

Q2: Is the protocol implemented and deployed widely?
==========================================

The requester provided the following information when I asked about who was 
currently using TIPC (pretty much about adoption and deployment):

I can give you a list of current or recently active code contributors and 
companies/people who have been asking for support:

Huawei:
For natural reasons I don't know any details about them, I can only name 
persons I have seen contributing to netdev or being active on our mailing 
lists. Huawei people sometimes use gmail addresses when posting questions and 
patches, so there are more persons than I have listed here.
Dmitry Kolmakov <kolmakov.dmit...@huawei.com>
Ji Qin <jiqin...@huawei.com>
Wei Yongjun <weiyongj...@huawei.com>
<songshuaishu...@huawei.com>
Yue Haibing <yuehaib...@huawei.com>
Junwei Hu <hujunw...@huawei.com>
Jie Liu <liujie...@huawei.com>
Qiang Ning <ningqia...@huawei.com>
Zhiqiang Liu <liuzhiqian...@huawei.com>
Miaohe Lin <linmia...@huawei.com>
Wang Wang <wangwa...@huawei.com>
Kang Zhou <zhouka...@huawei.com>
Suanming Mou <mousuanm...@huawei.com>

Hu Junwei is the one I see most active at the moment.

Nokia:
Tommi Rantala <tommi.t.rant...@nokia.com>

Verizon:
Amar Nv <amar...@in.verizon.com>
Jayaraj Wilson <jayaraj.wil...@in.verizon.com>

Hewlett Packard Enterprise:
<jonas.ar...@hpe.com>

WindRiver:
Ying Xue <ying....@windriver.com>
He is my co-maintainer at netdev and SourceForge.
Windriver has several products in the field based on TIPC, e.g. control system 
for Sikorsky helicopters.

Orange:
Christophe JAILLET <christophe.jail...@wanadoo.fr>

Redhat:
The person contacting me to have TIPC integrated and maintained in RHEL-8.0 was
Sirius Rayner-Karlsson <akarls...@redhat.com>
He motivated it with a request from "a telco vendor", but I don't know which 
one.
Hence, TIPC is now integrated in and officially supported from RHEL 8.1

ABB:
https://new.abb.com/pl
Mikolaj K. Chojnacki <mikolaj.k.chojna...@pl.abb.com>
Krzysztof Rybak <krzysztof.ry...@pl.abb.com>

Ericsson:
All (dozens of) applications based on the TSP and Core Middleware/Components 
Based Architecture (CMW/CBA) platforms are by definition based on TIPC. They 
have not yet started to use TIPC on their Kubernetes based ADP platform, but 
there is work ongoing on this.

I also see numerous other people being active, from small (I believe) 
companies, universities and private contributors. E.g.,
Innovsys Inc  http://www.innovsys.com/innovsys/
Allied Telesis https://www.alliedtelesis.com/
Telaverge Communications http://www.telaverge.com/
Ivan Serdyuk <local.tourist.k...@gmail.com> (seems to be responsible for the 
ZeroMQ port of TIPC)
Johns Hopkins University / Fast LTA, Munich <peter.hans.froehl...@gmail.com>
Just to mention a few...

TIPC is currently maintained jointly by Ericsson, WindRiver, Redhat, and the 
Australian consulting company DEK Technologies https://www.dektech.com.au/

Thanks
Suresh

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area


