Re: [Taps] MTU / equivalent at the transport layer

Michael Welzl Mon, 12 Dec 2016 01:32:29 -0800

Hi,

Just trying to understand, so we're not talking past each other. Please note 
that I'm not trying to argue in any direction with my comments below, just 
asking for clarification:



> On 09 Dec 2016, at 18:32, Joe Touch <[email protected]> wrote:
> 
> 
> 
> On 12/9/2016 8:12 AM, Michael Welzl wrote:
>>> On 09 Dec 2016, at 16:18, Joe Touch <[email protected]> wrote:
>>> 
>>> 
>>> 
>>> On 12/9/2016 12:09 AM, Michael Welzl wrote:
>>>>> On 07 Dec 2016, at 20:29, Joe Touch <[email protected]> wrote:
>>>>> 
>>>>> FYI, there are two different "largest messages" at the transport layer:
>>>>> 
>>>>> 1) the size of the message that CAN be delivered at all
>>>> True... I wasn't thinking of that, but yes.
>>>> 
>>>> 
>>>>> 2) the size of the message that can be delivered without network-layer
>>>>> fragmentation (there are no guarantees about link-layer - see ATM or the
>>>>> recent discussion on tunnel MTUs on INTAREA)
>>>>> 
>>>>> MTU generally refers to the *link payload*. At that point, transports
>>>>> have to account for network headers, network options, transport headers,
>>>>> and transport options too. See RFC6691.
>>>>> 
>>>>> MSS refers to the transport message size AFAICT. It is *sometimes* tuned
>>>>> to MTU-headers but not always.
>>>>> 
>>>>> E.g., for IPv6, link MTU is required to be at least 1280, but the
>>>>> src-dst transit MTU is required to be at least 1500. So a transport that
>>>>> wants to match sizes and reduce fragmentation issues would pick
>>>>> 1280-IPh-IPo-TCPh-TCPo, but a transport is supposed to be able to trust
>>>>> that 1500-IPh-IPo-TCPh-TCPo can still get through at least some of the 
>>>>> time.
>>>> So I'm getting the impression that the answer to my question really is 
>>>> that, to figure out 2) (which I was concerned with), an application 
>>>> programmer needs to do the calculation her/himself.
>>> To figure out 2), the transport layer needs to know the unfragmented
>>> link MTU, the size of all of the network headers (including options),
>>> and the sizes of its own headers and options.
>>> 
>>> It is also sometimes assumed that the transport can control the "DF" bit
>>> (for IPv4).
>> Yes - but that hardly sounds worse to me than requiring the application 
>> programmer to do this protocol-specific calculation by hand...
> 
> The app programmer needs to know what the transport can support, the
> transport needs to know what net supports, etc.
> 
> Pushing the link MTU up the line and expecting all the other layers to
> figure out what to do results in unnecessary complexity, never mind
> undermining one of the key features of layering.

Either we just agree here, or you're saying that your 2) above should not be 
exposed? Or something else?


>>> However, this all breaks down if the app makes the wrong choice because
>>> the net can (will, and should) source fragment if it gets a message that
>>> turns out to be too big for one fragment anyway.
>>> 
>>>> Not a big deal - and maybe some systems offer a function to give you the 
>>>> size of a message that won't be fragmented.
>>> Remember that - at best - you're optimizing for the next layer down
>>> only. You can't know whether that net layer message is link fragmented
>>> (e.g., as in ATM) or tunnel fragmented (as needs to be required or this
>>> whole MTU concept breaks down).
>> Sure - but that's something end systems just can't see. It's information up 
>> to and including the IP layer that should be correctly handed over up the 
>> stack, inside the host, with all the caveats this information comes with.
> Why does that apply at the link layer but not other layers? If transport
> can transfer and reassemble 1MB messages, then that's the "MTU" it needs
> to tell the app layer. The same is true for net to tell transport, etc.
> 
> We've conflated the two between transport and net unnecessarily.

So this sounds like you're saying that your item 2) above should not be exposed 
by the transport layer to the application.


>>>> However: this calculation is transport protocol dependent, which we really 
>>>> don't want to have in TAPS.
>>> If you want to fix this, you need to change the API to the net layer to
>>> provide immediate feedback. When transport hands a segment to network,
>>> it has to get a "call failed" if the message is too big - and we really
>>> do need transport layers to be able to pick between "too big for
>>> non-fragmented net layer" and "too big for the net layer even with frag".
>>> 
>>> Merely handing info to the transport layer might not be enough, esp.
>>> when net layer option lengths change.
>> True if you want to cleanly fix it across the RFC-specified stack, but 
>> that's beyond the concern of TAPS - it becomes a requirement from the TAPS 
>> WG. Does that make sense?
> 
> Then this is part of the API requirements that TAPS should be
> indicating, no?

So what does that mean: that the API should contain a "don't fragment" flag 
from the application?


Cheers,
Michael
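
For concreteness, the size arithmetic discussed in this thread (a limit minus
network header, network options, transport header and transport options) and
the "too big without fragmentation" versus "too big even with fragmentation"
distinction could be sketched roughly as below. This is only an illustration
under stated assumptions: the names max_payload, SendOutcome and classify are
hypothetical and belong to no real API, the header sizes assume IPv6 (40-byte
fixed header) carrying TCP (20-byte base header), and 1280 and 1500 are the
figures quoted above.

```python
from enum import Enum

# Figures quoted in the thread for IPv6 (assumed here, not queried from a stack).
UNFRAGMENTED_LIMIT = 1280  # size expected to pass without network-layer fragmentation
FRAGMENTED_LIMIT = 1500    # size expected to get through with fragmentation

IPV6_FIXED_HEADER = 40     # bytes, no extension headers
TCP_BASE_HEADER = 20       # bytes, no TCP options


def max_payload(limit, net_header, net_options, transport_header, transport_options):
    """Return the transport payload that fits under `limit` (the 'X-IPh-IPo-TCPh-TCPo' sum)."""
    payload = limit - net_header - net_options - transport_header - transport_options
    if payload < 0:
        raise ValueError("headers and options alone exceed the limit")
    return payload


class SendOutcome(Enum):
    """Hypothetical feedback a lower layer could hand back to its caller."""
    FITS_UNFRAGMENTED = "fits without network-layer fragmentation"
    NEEDS_FRAGMENTATION = "too big for non-fragmented net layer"
    TOO_BIG = "too big for the net layer even with fragmentation"


def classify(message_len, unfragmented_payload, fragmented_payload):
    """Map a message length onto the three outcomes distinguished in the thread."""
    if message_len <= unfragmented_payload:
        return SendOutcome.FITS_UNFRAGMENTED
    if message_len <= fragmented_payload:
        return SendOutcome.NEEDS_FRAGMENTATION
    return SendOutcome.TOO_BIG


if __name__ == "__main__":
    small = max_payload(UNFRAGMENTED_LIMIT, IPV6_FIXED_HEADER, 0, TCP_BASE_HEADER, 0)
    large = max_payload(FRAGMENTED_LIMIT, IPV6_FIXED_HEADER, 0, TCP_BASE_HEADER, 0)
    for length in (1000, 1300, 2000):
        print(length, classify(length, small, large).value)
```

Because the option lengths are parameters, the result changes whenever net or
transport options change, which is why a one-time hand-off of the link MTU
might not be enough.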

_______________________________________________
Taps mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/taps
