Re: [Int-area] Please review: draft-savola-mtufrag-network-tunneling-04.txt

Pekka Savola Wed, 21 Sep 2005 01:10:34 -0700

Hi,

A high-order bit first,

The goal of the doc is to say "these are the scenarios, and the main solutions people have adopted. One solution is non-compliant, and here's where it causes problems".


Specifically, the goal is not and will not be either of:

       1. argue that the [IPv4 base] spec should be changed
       2. argue that the violating [tunnel] implementations should be
          changed

.. because a v4 spec change is a rathole I do not want to dig into in this document (feel free to write a draft, I could even help), and there are no real solutions as to how they should be changed (yes, we probably disagree on whether some of the proposals you made are real solutions or not, but let's just agree to disagree on that).

On the other hand, if you think I have not described an important solution (e.g., using multiple IP addresses for each encapsulator/decapsulator as needed), I'm open to text suggestions.

Apart from the major document focus issues, I have difficulty seeing which changes you'd like to see, so I'll just have to ask "send text" and I'll check it out.

A few more bits inline, but I don't think I'll continue following up with this much longer as we're already going in circles and it isn't getting any better.


On Thu, 15 Sep 2005, Joe Touch wrote:

Pekka Savola wrote:

I already described one example of a utility w/ avoiding high-speed
reassembly at a router, which just simply does not work.  I could query
for others from vendors, but I don't think those use cases belong in the
draft at least in a verbose manner. Including the use cases would seem
to make even more compelling case for causing more violations in those
cases than just being silent.  Further, it would cause ratholing
discussions on whether folks see those use cases being really required
or not, even though the specifics (and discussion about them) are not the
point, just that the operators ARE using them and probably for good
reasons.


Whether vendors fail to support requirements and whether there are good
reasons for their doing so are two different things; ignorance and
apathy can be as likely as real cause. You're right that we don't need
to go into a poll of current products or deployments, but I'm
uncomfortable with asserting that we have to live with it or change the
requirement because vendors don't correctly support it. If that were the
case, there would be no point to docs like RFC2525 (known TCP
implementation problems).

I think you're comparing apples to oranges. RFC2525 seems to describe mainly unintentional implementation bugs and similar defects.

The key here is whether the point of this doc is to suggest that the
spec be modified because of real reason, or to document current
violations of spec in this regard. The third option - which is how it's
currently described - is "here's what people do; it's a violation, but
people do it". I don't believe it's appropriate to even indirectly
validate broken implementations because of a herd mentality, so one of
the first two is preferable. I.e., either it has to be "there are real
reasons and the spec should be changed", or "people do it, it's a
violation, and here's where it causes problems".

I don't see "here's what people do; it's a violation, but people do it" as validating.

I don't want to go down the rathole (in this document at least) of trying to specify how RFC791 or other similar specs should be changed. It's not the point of this effort.

The spirit of this doc is "people do it, it's a violation, and here's where it causes problems" (though currently the text on "here's where it causes problems" is not too extensive -- feel free to send text).

And I don't buy the claim that people do it because they can't implement
it; ATM segmentation and reassembly works fine at very high speeds too,
and although segments must come in in-order, they are much smaller data
chunks (typically). Sure, it's expensive, but that's not a reason to
violate a spec.

Have you seen 10G ATM? Even 2.5G ATM is very rare. Oh by the way, did you notice that ATM lost almost all of its operational relevance over 5 years ago? There may be a connection here.

It's
even OK to drop packets that are fragments altogether if they hit a
tunnel that won't carry them.


If we assume MTU=1500, would it be OK to drop all the packets with size
1472 (or something thereabouts) which have DF bit set?


Not only "OK", but _required_ as I read the specs. Yes, things that
don't adjust will see a blackhole - which we already know.

Sure, but it's not operationally acceptable that the users see a blackhole.

That's what
we're talking about here (and the unfeasibility of signalling back
"Packet too Big" to all those sources sending >1472 (or whatever)
packets).  While RFC791 says it's OK, it certainly isn't in practice.
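
The size threshold being argued about here can be sketched as follows. This is only a minimal illustration, assuming a 1500-byte link MTU and a hypothetical encapsulation overhead of 28 bytes (chosen so that 1472 bytes is the largest inner packet that still fits; the thread itself says "1472 or something thereabouts", and real tunnels differ). The function name and the returned labels are made up for the example.

```python
def encap_action(inner_len, inner_df, link_mtu=1500, overhead=28):
    """Return what a spec-following encapsulator may do with one inner packet.

    link_mtu and overhead are illustrative assumptions, not values
    taken from any particular tunnel implementation.
    """
    if inner_len + overhead <= link_mtu:
        # The encapsulated packet still fits on the link.
        return "forward"
    if inner_df:
        # DF is set and the packet no longer fits: it has to be dropped,
        # and the source has to be told to send smaller packets.
        return "drop-and-signal"
    # DF is clear, so fragmentation is permitted.
    return "fragment"


for size, df in [(1472, True), (1473, True), (1473, False)]:
    print(size, df, encap_action(size, df))
```

Under these assumptions every DF-marked packet larger than 1472 bytes takes the "drop-and-signal" branch, which is the per-packet signalling load the operational objection is about.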


http://www.caida.org/analysis/workload/fragments/sdscposter.xml

That shows that "A significant portion of the fragmented traffic that
crosses the UCSD-CERF link is tunneled traffic.". So it's practice by at
least one analysis.

Having seen it on other tunnels all over the place (I do a lot of tunnel
research, FWIW), I would agree that it's possible to hit places that do
violate spec, but it hasn't affected the majority of paths.

Host-to-host tunneling or other tunneling on a small scale in other contexts certainly can use PMTUD -- also as described in section 3.2. I don't dispute it, and I certainly use it myself.

It just doesn't scale up enough. Let's say for example that Internet2 admins did some tunneling between routers -- in such a manner that PMTUD would be needed for every single packet between every single (src,dst) pair that traverses Internet2. Would you consider it (operationally) OK? I wouldn't -- it would mean that the routers would need to signal hundreds of millions of times a day.

It's still NOT OK to clear the DF bit in the outer header when it's set
in the inner, though.


Sure, sure -- but that's still being done widely out there.  What would
you prefer? Be hush-hush about it?  I want to bring the problems out in
the open, with appropriate disclaimers of course.


As above, there are three options:

        1. argue that the spec should be changed for cause
        2. argue that the violating implementations should be changed

"Everyone does it", or even "many implementations do it" isn't a
sufficient reason to change a spec that has compliant alternatives (drop
the packets) especially when the correct operation of some protocols
(PMTU) depends on that behavior.

These options are different from the ones you said above; I'm fine with "here's what people do; it's a violation, but people do it" but neither of the two above options is acceptable to me. I won't go into the reasons (again) because we're obviously going in circles on this.

I have no control on why the authors didn't continue the work; I
certainly provided some suggestions for enhancements myself (note: if
the authors are listening, I could consider picking up the draft if
you'd like).  But that aside..

Requiring hundreds or thousands of IP aliases to overcome IP ID wrapping
is not a [real] solution.  Could you please state a better one if one
exists?


Huh? You want to change everyone's spec for tunneling, but the IP ID
space is not on the table? Why not have a 'large ID option'?

I'm pretty sure you know the study which showed that in most cases, IP options caused packets to be dropped. Hardware implementations (e.g., ACLs etc) also don't like IP options AFAICT.

Besides, most hosts don't have an IP ID problem; even though gigabit
interfaces are common, the wrap problem comes up in supercomputer
contexts primarily.


As said, the problem is not on the hosts but routers.

Wrt. the non-compliance, are you referring to this:

    The choice of the Identifier for a datagram is based on the need to
    provide a way to uniquely identify the fragments of a particular
    datagram.  The protocol module assembling fragments judges fragments
    to belong to the same datagram if they have the same source,
    destination, protocol, and Identifier.  **Thus, the sender must choose
    the Identifier to be unique for this source, destination pair and
    protocol for the time the datagram (or any fragment of it) could be
    alive in the internet.**

    It seems then that a sending protocol module needs to keep a table
    of Identifiers, one entry for each destination it has communicated
    with in the last maximum packet lifetime for the internet.


It needs to ensure that the first paragraph isn't violated, which a
later paragraph addresses via a simpler mechanism that is more typical
in current use:
   However, since the Identifier field allows 65,536 different values,
   some host may be able to simply use unique identifiers independent
   of destination.

That doesn't help in practice with high-speed encap/decap between two routers because all (most) tunneling happens between fixed endpoints. Making the identifier space (src,dst)-specific doesn't help here.
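
To put rough numbers on the wrap problem between two fixed tunnel endpoints, here is a back-of-the-envelope sketch. The link rates and the 1500-byte packet size are assumptions picked for illustration, not figures from this thread, and the helper name is made up; only the 65,536-value ID space comes from the RFC791 text quoted above.

```python
ID_SPACE = 65_536  # distinct values of the 16-bit IPv4 Identification field


def id_wrap_seconds(link_bps, packet_bytes):
    """Seconds until the ID counter wraps for one (src, dst, protocol) tuple,
    assuming the link is saturated with packets of a single fixed size."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return ID_SPACE / packets_per_second


for label, bps in [("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label}: ID space wraps after {id_wrap_seconds(bps, 1500):.3f} s")
```

With these assumed values the ID space is exhausted in about 0.786 s at 1 Gbit/s and about 0.079 s at 10 Gbit/s. Smaller packets wrap proportionally faster, and because all the traffic shares one (src,dst) pair, a per-destination ID table gives no extra headroom.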

(esp last sentence of 1st paragraph and 2nd para.)

I don't see any real solutions here.  I don't think it's acceptable to
refuse to send any more packets to destination X until an ID slot is
freed (a buffering problem, a denial of service issue), as it is not
acceptable to switch source IP addresses because the peer would likely
no longer recognize the session because the IPs changed.


If it's not acceptable, then we need to redo RFC791. Right now, even
when DF is set, the rule still applies.


I encourage you to write a draft on this.

Note that although RFC791 defines the ID field as for fragmentation, it
never says that if the DF bit is set that the frag  field is not
meaningful or may be set to 0.

RFC1122 does indicate that the IP ID field can be used by routers to
omit duplicate IP packets:

               (2) a
                congested gateway might use the IP Identification field
                (and Fragment Offset) to discard duplicate datagrams
                from the queue.


I don't see how this is relevant.

Again my question would be, what changes would you like to see in the
draft to make this clearer?


I made some above...

I welcome specific suggestions (as in, "in section x, paragraph y, change [this] text to [that]") within the scope constraints of the document.

As it is, I will not make major changes to the scope and focus of the document (against others' feedback), so if you have minor changes in mind that you'd like to see, please state them clearly.

OK.  I tried to see how to add that in the text, but couldn't figure a
way how to incorporate so it would fit in nicely.  Could you propose
which exact clarification wording (and where) would be helpful?


It may be in multiple places. The part where you mention setting the DF
needs to always examine the MF bit. This is an issue only in headers you
do not generate - i.e., for the inner header. The case where it seemed
ambiguous, and a simple way to fix it:

--- existing ---
  When desiring to avoid fragmentation, IPv4 allows two options: copy
  the DF bit from the inner packets to the encapsulating header, or
  always set the DF bit.  The latter is better especially in controlled
  environments, because it forces PMTUD to converge immediately.
--- proposed ---
  When desiring to avoid fragmentation, IPv4 allows two options: copy
  the DF bit from the inner packets to the encapsulating header, or
  always set the DF bit of the outer header.  The latter is
  better especially in controlled
  environments, because it forces PMTUD to converge immediately.
---


Changed.
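
For readers following along, the two permitted choices for the outer header's DF bit, and the violation discussed earlier in this thread (clearing DF in the outer header when it is set in the inner one), can be sketched like this. The enum and function names are hypothetical and exist only for this example; they are not taken from the draft or from any implementation.

```python
from enum import Enum


class DfPolicy(Enum):
    COPY_INNER = "copy"  # outer DF mirrors the DF bit of the inner header
    ALWAYS_SET = "set"   # outer DF is set regardless of the inner header


def outer_df(inner_df, policy):
    """Return the DF bit to place in the encapsulating (outer) header."""
    if policy is DfPolicy.ALWAYS_SET:
        return True
    return inner_df


def clears_inner_df(inner_df, outer_df_bit):
    """True when the outer header has DF cleared although the inner has it set."""
    return inner_df and not outer_df_bit


for inner in (False, True):
    for policy in DfPolicy:
        outer = outer_df(inner, policy)
        print(inner, policy.name, outer, clears_inner_df(inner, outer))
```

Neither policy can produce an outer header with DF cleared while the inner header has it set, so the last column is always False.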

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

_______________________________________________
Int-area mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/int-area
