Re: [Int-area] Please review: draft-savola-mtufrag-network-tunneling-04.txt

Joe Touch Fri, 23 Sep 2005 13:17:47 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Pekka Savola wrote:
> Hi,
> 
> A high-order bit first,
> 
> The goal of the doc is to say "these are the scenarios, and the main
> solutions people have adopted.  One solution is incompliant, and here's
> where it causes problems".
> 
> Specifically, the goal is not and will not be either of:
> 
>        1. argue that the [IPv4 base] spec should be changed
>        2. argue that the violating [tunnel] implementations should be
>           changed

That's fine; it just clarifies some other aspects of the doc. I don't
agree that either should be pursued, FWIW.

...
> On the other hand, if you think I have not described an important
> solution (e.g., using multiple IP addresses for each
> encapsulator/decapsulator as needed), I'm open to text suggestions.

I hope to do that, with the caveat below.

> Apart from the major document focus issues, I have difficulty seeing
> which changes you'd like to see so I'll just have to ask "send text" and
> I'll check it out.

IMO, the major doc focus issues are more important and need to be
resolved before it would be useful to contribute text, since the latter
assumes that such pointwise patches would be productive, and I don't yet
see that.

> A few more bits inline, but I don't think I'll continue following up
> with this much longer as we're already going in circles and it isn't
> getting any better.

Agreed. Hopefully the clarifications below will help the context of the
discussion for others to follow at least...

> On Thu, 15 Sep 2005, Joe Touch wrote:
> 
>> Pekka Savola wrote:
>>
>>> I already described one example of a utility w/ avoiding high-speed
>>> reassembly at a router, which just simply does not work.  I could query
>>> for others from vendors, but I don't think those use cases belong in the
>>> draft at least in a verbose manner. Including the use cases would seem
>>> to make even more compelling case for causing more violations in those
>>> cases than just being silent.  Further, it would cause ratholing
>>> discussions on whether folks see those use cases being really required
>>> or not, even though the specifics (and discussion about them) is not the
>>> point, just that the operators ARE using them and probably for good
>>> reasons.
>>
>>
>> Whether vendors fail to support requirements and whether there are good
>> reasons for their doing so are two different things; ignorance and
>> apathy can be as likely as real cause. You're right that we don't need
>> to go into a poll of current products or deployments, but I'm
>> uncomfortable with asserting that we have to live with it or change the
>> requirement because vendors don't correctly support it. If that were the
>> case, there would be no point to docs like RFC2525 (known TCP
>> implementation problems).
> 
> I think you're comparing apples to oranges.  RFC2525 seems to describe
> mainly unintentional implementation bugs and similar defects.

Agreed - I'm arguing that this doc should have more of RFC2525's tone of
"this is a bug", or a tone of "here's the change we propose to the spec"
(which we agree it should not). I don't see the purpose of the current
'apple' if it's not more like one of those oranges ;-)

>> The key here is whether the point of this doc is to suggest that the
>> spec be modified because of real reason, or to document current
>> violations of spec in this regard. The third option - which is how it's
>> currently described - is "here's what people do; it's a violation, but
>> people do it". I don't believe it's appropriate to even indirectly
>> validate broken implementations because of a herd mentality, so one of
>> the first two is preferable. I.e., either it has to be "there are real
>> reasons and the spec should be changed", or "people do it, it's a
>> violation, and here's where it causes problems".
> 
> I don't see "here's what people do; it's a violation, but people do it"
> as validating.

It is more validating than "here is a bug that people do that needs to
be fixed," ala RFC2525. I don't see the point of RFCs documenting
current practice that violates spec unless the goal is to change the
spec (which we agree is not the current goal) or the focus is to get the
bug fixed (which is not how the current doc is written).

> I don't want to go down to the rathole (in this document at least) of
> trying to specify how RFC791 or other similar specs should be changed.
> It's not the point of this effort.
> 
> The spirit of this doc is "people do it, it's a violation, and here's
> where it causes problems". (though currently the text on "here's where
> it causes problems" is not too extensive -- feel free to send text)
> 
>> And I don't buy the fact that people do it because they can't implement
>> it; ATM segmentation and reassembly works fine at very high speeds too,
>> and although segments must come in in-order, they are much smaller data
>> chunks (typically). Sure, it's expensive, but that's not a reason to
>> violate a spec.
> 
> Have you seen 10G ATM?  Even 2.5G ATM is very rare.  Oh by the way, did
> you notice that ATM has lost almost all of its operational relevance
> over 5 years ago?  There may be a connection here.

CAIDA has notes about OC48 traces - that's 2.5 Gbps.

ATM lost its relevance for a number of reasons - many of which focus on
the extremely low data chunksize and large overhead. The complexity of
SAR hardware may or may not have had much to do with that, but SAR
hardware capable of gigabit speeds was designed by classmates of mine
back in the early 1990s, 5 years before GigE was available.

>>>> It's
>>>> even OK to drop packets that are fragments altogether if they hit a
>>>> tunnel that won't carry them.
>>>
>>>
>>> If we assume MTU=1500, would it be OK to drop all the packets with size
>>> 1472 (or something thereabouts) which have DF bit set?
>>
>>
>> Not only "OK", but _required_ as I read the specs. Yes, things that
>> don't adjust will see a blackhole - which we already know.
> 
> Sure, but it's not operationally acceptable that the users see a black
> hole.

Then most operational networks are unacceptable, since even tunnels
create those: ICMP 'too big' messages are often blocked by firewalls.
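
As a rough illustration of the compliant ingress behavior discussed above
(drop and signal when DF is set, rather than clearing DF in the outer
header), here is a minimal Python sketch. The names `Action` and
`ingress_action` are hypothetical, and the 20-byte overhead assumes a bare
outer IPv4 header; real tunnels (GRE, IPsec, etc.) add more.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "encapsulate and forward"
    FRAGMENT = "fragment, then encapsulate"
    DROP_SEND_ICMP = "drop; send ICMP 'fragmentation needed'"


def ingress_action(inner_len, df_set, link_mtu=1500, overhead=20):
    """Decide what a tunnel ingress does with one inner packet.

    overhead=20 is an assumption (plain IPv4-in-IPv4 encapsulation).
    Returns the action and the MTU the tunnel can offer to inner packets.
    """
    tunnel_mtu = link_mtu - overhead
    if inner_len <= tunnel_mtu:
        return Action.FORWARD, tunnel_mtu
    if df_set:
        # The sender asked for no fragmentation: drop and report the MTU
        # so PMTUD can adapt (unless a firewall eats the ICMP message).
        return Action.DROP_SEND_ICMP, tunnel_mtu
    return Action.FRAGMENT, tunnel_mtu


print(ingress_action(1500, df_set=True))
print(ingress_action(1500, df_set=False))
```

The black hole arises in the first case whenever the ICMP message never
reaches the sender.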

>>> That's what
>>> we're talking about here (and the unfeasibility of signalling back
>>> "Packet too Big" to all those sources sending >1472 (or whatever)
>>> packets).  While RFC791 says it's OK, it certainly isn't in practise.
>>
>>
>> http://www.caida.org/analysis/workload/fragments/sdscposter.xml
>>
>> That shows that "A significant portion of the fragmented traffic that
>> crosses the UCSD-CERF link is tunneled traffic.". So it's practice by at
>> least one analysis.
>>
>> Having seen it on other tunnels all over the place (I do a lot of tunnel
>> research, FWIW), I would agree that it's possible to hit places that do
>> violate spec, but it hasn't affected the majority of paths.
> 
> Host-to-host tunneling or other tunneling on a small scale in other
> contexts certainly can use PMTUD -- also as described in section 3.2. I
> don't dispute it, and I certainly use it myself.
> 
> It just doesn't scale up enough.  Let's say for example that Internet2
> admins did some tunneling between routers -- in such a manner that PMTUD
> would be needed for every single packet between every single (src,dst)
> pair that traverses Internet2.  Would you consider it (operationally)
> OK?  I wouldn't -- it would mean that the routers would need to signal
> hundreds of millions times a day.

It is operationally OK, since such tunnels are already in use for pipes
between enterprises (VPNs).

>>>> It's still NOT OK to clear the DF bit in the outer header when it's set
>>>> in the inner, though.
>>>
>>>
>>> Sure, sure -- but that's still being done widely out there.  What would
>>> you prefer? Be hush-hush about it?  I want to bring the problems out in
>>> the open, with appropriate disclaimers of course.
>>
>>
>> As above, there are two options:
>>
>>     1. argue that the spec should be changed for cause
>>     2. argue that the violating implementations should be changed
>>
>> "Everyone does it", or even "many implementations do it" isn't a
>> sufficient reason to change a spec that has compliant alternatives (drop
>> the packets) especially when the correct operation of some protocols
>> (PMTU) depends on that behavior.
> 
> These options are different than the ones you said above; I'm fine with
> "here's what people do; it's a violation, but people do it" but neither
> of the two above options is acceptable to me.  I don't go to the reasons
> (again) because we're obviously going circles on this.

Documenting existing practice and pointing out the violation as a
side-issue isn't acceptable to me because it ends up meaning that this
doc is a snapshot of current practice, which I don't think is
appropriate for an RFC. IMO, the RFC should make a statement that this
is either OK and the spec should be changed or that it's not OK and the
implementations should be fixed. "this is what we saw" stuff is
ephemeral by nature.

>>> I have no control on why the authors didn't continue the work; I
>>> certainly provided some suggestions for enhancements myself (note: if
>>> the authors are listening, I could consider picking up the draft if
>>> you'd like).  But that aside..
>>>
>>> Requiring hundreds or thousands of IP aliases to overcome IP ID wrapping
>>> is not a [real] solution.  Could you please state a better one if one
>>> exists?
>>
>> Huh? You want to change everyone's spec for tunneling, but the IP ID
>> space is not on the table? Why not have a 'large ID option'?
> 
> I'm pretty sure you know the study which showed that in most cases, IP
> options caused packets to be dropped.  Hardware implementations (e.g.,
> ACLs etc) also don't like IP options AFAICT.

Sure - but I thought we were talking about special cases where
supercomputers wanted IP backward compatibility? They surely can upgrade
the components on their path not to drop packets with this option since
they want 10Gbps throughput between two endpoints.

>> Besides, most hosts don't have an IP ID problem; even though gigabit
>> interfaces are common, the wrap problem comes up in supercomputer
>> contexts primarily.
> 
> As said, the problem is not on the hosts but routers.

OK - as per above, though, it's an issue only on routers that are on the
high-speed path between two high-speed endpoints - hardly a widescale problem.
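
To put rough numbers on when the 16-bit IP ID wraps between one fixed
(src, dst, protocol) triple, here is a small Python sketch. The function
name and the 1500-byte packet size are assumptions chosen for illustration;
smaller packets wrap the ID space faster.

```python
ID_SPACE = 65_536  # distinct values in the 16-bit IP Identification field


def wrap_seconds(rate_bps, pkt_bytes=1500):
    """Seconds until the IP ID wraps for one (src, dst, proto) triple,
    assuming back-to-back packets of pkt_bytes at rate_bps."""
    packets_per_second = rate_bps / (8 * pkt_bytes)
    return ID_SPACE / packets_per_second


for label, rate in [("1 Gbps", 1e9), ("2.5 Gbps", 2.5e9), ("10 Gbps", 10e9)]:
    print(f"{label}: ID wraps after about {wrap_seconds(rate):.3f} s")
```

At 1 Gbps this gives roughly 0.79 s, and at 10 Gbps roughly 0.079 s, which
only matters for a pair of addresses that actually sustains that rate.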

>>> Wrt. the incompliancy, are you referring to this:
>>>
>>>     The choice of the Identifier for a datagram is based on the need to
>>>     provide a way to uniquely identify the fragments of a particular
>>>     datagram.  The protocol module assembling fragments judges fragments
>>>     to belong to the same datagram if they have the same source,
>>>     destination, protocol, and Identifier.  **Thus, the sender must
>>> choose
>>>     the Identifier to be unique for this source, destination pair and
>>>     protocol for the time the datagram (or any fragment of it) could be
>>>     alive in the internet.**
>>>
>>>     It seems then that a sending protocol module needs to keep a table
>>>     of Identifiers, one entry for each destination it has communicated
>>>     with in the last maximum packet lifetime for the internet.
>>
>>
>> It needs to ensure that the first paragraph isn't violated, which a
>> later paragraph addresses via a simpler mechanism that is more typical
>> in current use:
>>    However, since the Identifier field allows 65,536 different values,
>>    some host may be able to simply use unique identifiers independent
>>    of destination.
> 
> That doesn't help in practice with high-speed encap/decap with two
> routers because all (most) tunneling happens between fixed endpoints.
> Making the identifier space (src,dst) -specific doesn't help here.

Sure does - add more IP aliases for the tunnel endpoints!
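
A back-of-the-envelope Python sketch of how many aliases that would take,
since the uniqueness rule is per (source, destination, protocol). The
function names are hypothetical, and the 120-second packet lifetime and
1500-byte packets are assumptions for illustration only, not figures from
the draft.

```python
import math

ID_SPACE = 65_536  # 16-bit IP Identification field


def pairs_needed(rate_bps, pkt_bytes, lifetime_s):
    """Distinct (src, dst) address pairs needed so that no ID repeats on
    any one pair within lifetime_s (integer arithmetic throughout)."""
    packets_in_lifetime = rate_bps * lifetime_s // (8 * pkt_bytes)
    return max(1, -(-packets_in_lifetime // ID_SPACE))  # ceiling division


def aliases_per_end(pairs):
    """Smallest n with n * n >= pairs, if both endpoints get n aliases."""
    return math.isqrt(pairs - 1) + 1


pairs = pairs_needed(10_000_000_000, 1500, 120)
print(pairs, aliases_per_end(pairs))  # 1526 pairs, 40 aliases per endpoint
```

Under these assumptions the pair count is in the thousands, but spreading
aliases over both endpoints brings the per-endpoint count down to tens.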

>>> (esp last sentence of 1st paragraph and 2nd para.)
>>>
>>> I don't see any real solutions here.  I don't think it's acceptable to
>>> refuse to send any more packets to destination X until an ID slot is
>>> freed (a buffering problem, a denial of service issue), as it is not
>>> acceptable to switch source IP addresses because the peer would likely
>>> no longer recognize the session because the IPs changed.
>>
>>
>> If it's not acceptable, then we need to redo RFC791. Right now, even
>> when DF is set, the rule still applies.
> 
> I encourage you to write a draft on this.
> 
>> Note that although RFC791 defines the ID field as for fragmentation, it
>> never says that if the DF bit is set that the frag field is not
>> meaningful or may be set to 0.
>>
>> RFC1122 does indicate that the IP ID field can be used by routers to
>> omit duplicate IP packets:
>>
>>                (2) a
>>                 congested gateway might use the IP Identification field
>>                 (and Fragment Offset) to discard duplicate datagrams
>>                 from the queue.
> 
> I don't see how this is relevant.

It means that you can't set the IP ID to zero because a congested
gateway can consider all such packets between the same two addresses as
if they are copies, and just drop all but the first - all the time.

That means tunneled traffic would be preferentially dropped at
intermediate routers on the tunnel path, which isn't RED or drop-tail.
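
A toy Python sketch of that effect, assuming a gateway that discards queued
duplicates keyed on addresses, protocol, IP ID and fragment offset (the
RFC1122 text only says a gateway "might" do this; the function and field
names here are hypothetical):

```python
def enqueue_dedup(packets):
    """Queue packets, discarding any whose (src, dst, proto, id, frag_off)
    matches a packet already queued."""
    seen = set()
    queue = []
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["proto"], pkt["id"], pkt["frag_off"])
        if key in seen:
            continue  # treated as a duplicate and dropped
        seen.add(key)
        queue.append(pkt)
    return queue


def tunnel_packets(ids):
    """Five-field headers for packets between two fixed tunnel endpoints."""
    return [{"src": "192.0.2.1", "dst": "192.0.2.2", "proto": 4,
             "id": i, "frag_off": 0} for i in ids]


print(len(enqueue_dedup(tunnel_packets([0, 0, 0, 0, 0]))))  # 1
print(len(enqueue_dedup(tunnel_packets([1, 2, 3, 4, 5]))))  # 5
```

With the ID fixed at zero, every tunneled packet in the queue looks like a
copy of the first one.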

>>> Again my question would be, what changes would you like to see in the
>>> draft to make this clearer?
>>
>>
>> I made some above...
> 
> I welcome specific suggestions (as in, "in section x, paragraph y,
> change [this] text to [that]) within the scope constraints of the document.
> 
> As it is, I will not make major changes to the scope and focus of the
> document (against others' feedback), so if you have minor changes you'd
> like to see in mind, please state them clearly.

We're at an impasse then; there's no utility in pursuing incremental
changes to a doc whose overall tone I disagree with.

Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDNGG7E5f5cImnZrsRAmaRAJ98PQtGvyaAIGV3hQpyFCwc5PQZmgCfdSXm
UdDc6PcIxrSuhVJWxHYW9Bo=
=+GP+
-----END PGP SIGNATURE-----

_______________________________________________
Int-area mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/int-area
