On 26-jul-2005, at 13:46, Perry Lorier wrote:

6. Minimise the resources used.

Agree, except that packets are cheap on a 1000 Mbps LAN, so those don't
count much towards 6.

Packet rate, however, starts becoming a problem at faster speeds; at GigE
it becomes a problem for hosts to deal with unless they are
careful.  And not all networks are fast: 3G networks are becoming more
prevalent.  We should not waste resources needlessly :)

Well, the places where jumboframes are worth the trouble are also the places where a handful of packets won't make a difference. I'm not sure how fast 3G is, but I believe not more than a few Mbps, so jumboframes really aren't very useful there because they occupy the channel for too long. Doubly so on radio networks with their high bit error rates.

What happens on L2s where not every node can see every other node?

Neighbor discovery fails?

Host A can talk to Host B ok.
Host A can talk to Host C ok.
Host B can't talk to Host C.

This happens in ad hoc wireless networks.  With your system I'm not
entirely sure how you deal with whose "turn" it is next if not all nodes can see all other nodes. Host A should still be able to talk to Host B.

Well, a simple way to decide could be a log of the difference in MAC addresses (base 2 in this example). So after host 20 sends its packet, host 28 would wait for 3 seconds and host 36 for 4 seconds. But host 36 hears host 28 and resets its timer to 3 seconds. If hosts 28 and 36 can't hear each other, host 36 will send its packet 1 second after host 28 rather than 3 seconds. No big deal.
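
A minimal sketch of that timer rule, assuming the log is base 2, rounded down, and that the host numbers stand in for MAC addresses as plain integers. The function name `backoff_seconds` is made up for illustration.

```python
def backoff_seconds(last_sender: int, me: int) -> int:
    """Seconds to wait after hearing last_sender: floor(log2(|me - last_sender|))."""
    diff = abs(me - last_sender)
    if diff == 0:
        raise ValueError("a host does not wait on its own packet")
    # bit_length() - 1 is floor(log2(diff)) for positive integers
    return diff.bit_length() - 1

print(backoff_seconds(20, 28))  # host 28 after hearing host 20: 3
print(backoff_seconds(20, 36))  # host 36 after hearing host 20: 4
print(backoff_seconds(28, 36))  # host 36 after hearing host 28: 3
```

If host 36 never hears host 28, its original 4 second timer simply expires 1 second after host 28 has sent.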

If A and B are talking to each other and C and D are talking to each
other, why do (A and B) need to talk to C and D?

Ah, but how do you know that A doesn't talk to D, and is never going to?

How do you know it will in the time before the topology of the network
changes?  Given that the topology of the network changes every time a
host comes and goes, the chance that you'll want to talk to most of the
users during that time is rather low.

Look at it this way: if two routers send out RAs every 10 seconds, that's one packet every 5 seconds. If 60 hosts all send one packet every five minutes, that's also one packet every 5 seconds.
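
The arithmetic behind that comparison, as a small check. The helper name is hypothetical, and it assumes the senders are spread evenly over the interval.

```python
def seconds_between_packets(senders: int, interval_s: float) -> float:
    """Average gap between packets when each sender transmits once per interval."""
    return interval_s / senders

routers = seconds_between_packets(2, 10)      # two routers, one RA each per 10 s
hosts = seconds_between_packets(60, 5 * 60)   # 60 hosts, one packet each per 5 min
print(routers, hosts)  # 5.0 5.0
```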

I'd start at the minimum "MTU" size.

Yes, I thought about this and first trying a 1508 byte packet makes sense: if jumboframes don't work, you've wasted as little time and bandwidth as possible. If they do work, you've only wasted 1508 bytes.

A colleague of mine (Matthew Luckie) has done some research into path
MTUs.  He has a work-in-progress paper (
http://www.wand.net.nz/~mjl12/debugging-pmtud.pdf ) where he enumerates
all the common MTUs he's seen on the Internet.

And reaches a very interesting conclusion! Exchanging per-neighbor MTUs would really help here.

I'd start with a similar table, trying the lowest size first. If that is received, try the next lowest size, and so on until you don't get a reply. When you don't get a reply, try previous-mtu-that-worked+1; if that succeeds, start a binary search between previous-mtu-that-worked
and the one that didn't.
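
A rough sketch of that ramp-up, under stated assumptions: `probe(size)` is a hypothetical callback that returns True when a packet of that size gets a reply, and the table values are only illustrative sizes mentioned in this thread, not the list from the paper.

```python
from typing import Callable, Sequence

def ramp_up_mtu(probe: Callable[[int], bool], table: Sequence[int]) -> int:
    """Largest size that worked; table is sorted and table[0] is assumed to work."""
    best = table[0]
    failed = None
    for size in table[1:]:
        if probe(size):
            best = size          # keep climbing through the common sizes
        else:
            failed = size        # first timeout
            break
    if failed is None:
        return best              # every table entry worked
    if not probe(best + 1):
        return best              # second timeout: we sit on a common MTU
    lo, hi = best + 1, failed    # lo works, hi does not
    while hi - lo > 1:           # plain binary search as the last resort
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Illustrative table only (assumption): sizes that appear in this thread.
TABLE = [1500, 1508, 2048, 4352, 4464, 4470, 9000]
print(ramp_up_mtu(lambda size: size <= 4464, TABLE))  # common MTU
print(ramp_up_mtu(lambda size: size <= 3000, TABLE))  # uncommon MTU
```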

I partially agree. If you're at a well known boundary and want to search upward, it makes sense to try that well known boundary + minimum increment (I say: 4) first. That way, if you can't go beyond the current boundary, you know so immediately. Next is the highest possible value. If you can use that one, you're done.

But if previous low + minimum works but maximum doesn't, a mostly binary search still makes sense. However, it could be a "hinted" binary search. For instance, if you're searching between 1508 and 9000 (with the target being 4464) a strict binary search would do:

1  1508 yes
2  9000 no
3  5252 no
4  3380 yes
5  4316 yes
6  4784 no
7  4548 no
8  4432 yes
9  4488 no
10 4460 yes
11 4472 no
12 4464 yes
13 4468 no
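
The strict search can be sketched as follows. This assumes a 4-byte granularity with midpoints rounded down to a multiple of 4, which is what the numbers above suggest; `probe` is a hypothetical callback that returns True when a packet of that size gets through.

```python
def strict_binary_search(probe, low=1508, high=9000, step=4):
    """Return (largest working size, list of (size, ok) tries)."""
    if low % step or high % step:
        raise ValueError("low and high must be multiples of step")
    tried = []

    def attempt(size):
        ok = probe(size)
        tried.append((size, ok))
        return ok

    if not attempt(low):
        return None, tried       # even the known boundary fails
    if attempt(high):
        return high, tried       # maximum works, nothing to search
    lo, hi = low, high           # lo works, hi does not
    while hi - lo > step:
        mid = (lo + hi) // 2 // step * step
        if attempt(mid):
            lo = mid
        else:
            hi = mid
    return lo, tried

mtu, tried = strict_binary_search(lambda size: size <= 4464)
for n, (size, ok) in enumerate(tried, 1):
    print(n, size, "yes" if ok else "no")
```

With a 4464-byte limit this performs the same 13 tries as listed above.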

A hinted binary search could be:

1  1508 yes
2  9000 no
3  4470 no (closest value to binary 5252 target)
4  2048 yes (closest value to binary 2988 target)
5  2052 yes (see if 2048 was our limit)
6  4352 yes (closest value to binary 3260 target)
7  4356 yes (see if 4352 was our limit)
8  4464 yes (closest value to binary 4412 target)
9  4468 no (4464 was our limit)
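
One possible reading of the hinted variant, as a sketch: pick the known value closest to the binary midpoint, and after each hit test value + 4 to see whether that value was the limit. The hint list is taken from the example above; a real table would come from measurements. `probe` and the function name are hypothetical.

```python
def hinted_binary_search(probe, hints, low=1508, high=9000, step=4):
    """Return (largest working size, list of (size, ok) tries)."""
    tried = []

    def attempt(size):
        ok = probe(size)
        tried.append((size, ok))
        return ok

    if not attempt(low):
        return None, tried
    if attempt(high):
        return high, tried
    lo, hi = low, high                      # lo works, hi does not
    while hi - lo > step:
        mid = (lo + hi) // 2
        inside = [h for h in hints if lo < h < hi]
        if inside:
            guess = min(inside, key=lambda h: abs(h - mid))
        else:
            guess = mid // step * step      # no hints left: plain binary step
            if guess <= lo:
                guess = mid                 # keep the guess strictly inside (lo, hi)
        if not attempt(guess):
            hi = guess
            continue
        lo = guess
        if inside and hi - lo > step:       # was the hinted value our limit?
            if attempt(lo + step):
                lo += step
            else:
                hi = lo + step
    return lo, tried

HINTS = [2048, 4352, 4464, 4470]            # values from the example (assumption)
mtu, tried = hinted_binary_search(lambda size: size <= 4464, HINTS)
for n, (size, ok) in enumerate(tried, 1):
    print(n, size, "yes" if ok else "no")
```

With these hints and a 4464-byte limit, the sketch makes the same 9 tries as the list above.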

Note that although the second variant is faster overall, the first one finds a reasonable candidate (that can already be used at that point) at try 5, and the second one at try 6.

In this case your serial bottom-to-top search would probably be a bit faster, but it has two disadvantages: it takes a long time to find a high MTU, and it's not good at finding non-standard MTUs.

For a "common" MTU, you only have to endure two timeouts (the next
highest common MTU, and the +1 test). For an uncommon MTU you can
increase the MTU to maximum "common" MTU that's lower than your MTU
quickly, and can endure the timeouts from then on.

Note that with a 100 ms timeout (more than enough) you're done in less than 2 seconds worst case.
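
One way to sanity-check that figure, assuming a binary search between 1508 and 9000 at 4-byte granularity and, pessimistically, that every try waits out the full timeout. All constants are assumptions taken from the example in this thread.

```python
import math

LOW, HIGH, STEP = 1508, 9000, 4
TIMEOUT_MS = 100

candidates = (HIGH - LOW) // STEP             # 4-byte steps between the bounds
halvings = math.ceil(math.log2(candidates))   # binary search tries needed
probes = 2 + halvings                         # plus the two boundary tries
worst_case_ms = probes * TIMEOUT_MS
print(candidates, probes, worst_case_ms)
```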

So when system A tries with 3000 bytes (worked with C!) towards B, B sets an
ack flag and tries with 9216, which fails, so A sends a NAK and tries
with 6108, and so on.

Hang on, if they don't receive a packet, how can they know to send a
NAK?  If they're just waiting for a timeout, how can they know if the
packet got lost on the way there or on the way back?

Good question.  :-)

If instead of using special "ICMP MTU Probes" we use "ICMP Echo Request"
/"ICMP Echo Reply" messages, no changes to any packet formats are
needed; all that needs to be done is to implement it in a TCP/IP stack,
and the concept is even reusable for IPv4. Other hosts don't even have
to be upgraded to support this.  Magic!

You mean, rely just on ICMP and not announce a bigger MTU in RAs?

I guess you're right, but I wouldn't want to be a 10 Mbps host in an otherwise 64k jumbo-enabled network, because all those probes would eat up my bandwidth even though I can't successfully receive them.

Also, I think we want to be nicer to on-link probers than off-link ones, especially with these large packets.

Stacks would be free to do as you suggest (doing a binary search) or as
I suggest (ramp up and do a binary search only as a last resort).

Yes, this can be left up to the implementers.

So the general approach would be:
* If a packet arrives from a host that is larger than the cached MTU for
that neighbour, increase it to the size of the packet arriving.

Not sure if we want to do this check for every packet. Also, an attacker could fake the packet in order to do an "MTU attack" on a non-jumbo enabled host.

* When receiving a ND (but not a NS!), and you have no cached MTU for
that neighbour, you start the MTU discovery process, using any mechanism for selecting the packet sizes the implementation deems appropriate (i.e.,
either yours, or mine, or if someone can come up with a method that's
even better than ours, they could use that!).

With an MTU option in it. And why not NS?

No, the announcement "the switch can handle 4500 bytes" wouldn't have
anything to do with "I can handle 1500".

Which switch? I live in a flat with 3 other people, we have at least 4
devices that act like switches on one segment.  (2 switches, a voip
phone (you can daisy chain a PC off it), and an AP).  I have no idea
what the maximum MTU of all those switches is.

If all of those switches announce their MTU, we're in business.

On the other hand, if we do an MTU search we don't need this information because we'll find out ourselves.

If we don't do an MTU search and the switches don't announce their MTU, you're probably not going to use jumboframes on such a network...

It would be even better if we could ask the switch what our port
supports, but I'm not sure how to do this in such a way that a switch
that doesn't support this protocol floods the request so the results
are meaningless.

Hrm, so Ethernet has capability negotiation (which is how speed, duplex,
pause frame support etc. are negotiated).  I have no idea whether it says if
the switch supports jumbo frames; IEEE specs make my head hurt.

Autonegotiation only does 16 bits or something like that, no room to include the MTU there. Gigabit does have some in-band stuff like flow control, maybe that can be reused. But you always run the risk that a dumb switch just forwards those packets and screws up the negotiation.

--------------------------------------------------------------------
IETF IPv6 working group mailing list
[email protected]
Administrative Requests: https://www1.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
