On 26-jul-2005, at 13:46, Perry Lorier wrote:
6. Minimise the resources used.
Agree, except that packets are cheap on a 1000 Mbps LAN, so those don't
count much towards 6.
Packet rate however starts becoming a problem at faster speeds; at GigE
it starts becoming a problem for hosts to deal with unless they are
careful. And not all networks are fast: 3G networks are becoming more
prevalent. We should not waste resources needlessly :)
Well, the places where jumboframes are worth the trouble are also the
places where a handful of packets won't make a difference. I'm not
sure how fast 3G is, but I believe not more than a few Mbps, so
jumboframes really aren't very useful there because they occupy the
channel for too long. Doubly so on radio networks with their high bit
error rates.
What happens on L2s where not every node can see every other node?
Neighbor discovery fails?
Host A can talk to Host B ok.
Host A can talk to Host C ok.
Host B can't talk to Host C.
This happens in ad hoc wireless networks. With your system I'm not
entirely sure how you deal with whose "turn" it is next if not all nodes
can see all other nodes. Host A should still be able to talk to Host B.
Well, a simple way to decide could be a log of the difference in MAC
address. So after host 20 sends its packet, host 28 would wait for 3
seconds and host 36 for 4 seconds. But host 36 hears host 28 and
resets its timer to 3 seconds. If hosts 28 and 36 can't hear each
other, host 36 will send its packet 1 second after host 28 rather
than 3 seconds. No big deal.
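
A minimal sketch of that timer rule, assuming the "log" is base 2 and the
result is read as seconds (which is what the 20/28/36 example works out
to). The function name backoff_seconds is made up for illustration, and
small integers stand in for MAC addresses.

```python
import math


def backoff_seconds(my_addr: int, last_sender_addr: int) -> float:
    """Delay before sending: log2 of the address difference (base 2 is an assumption)."""
    diff = abs(my_addr - last_sender_addr)
    if diff == 0:
        raise ValueError("addresses must differ")
    return math.log2(diff)


# Host 20 has just sent its packet.
print(backoff_seconds(28, 20))  # host 28 waits 3.0 seconds
print(backoff_seconds(36, 20))  # host 36 waits 4.0 seconds
# Host 36 hears host 28 send and resets its timer relative to host 28.
print(backoff_seconds(36, 28))  # 3.0 seconds
```

If host 36 never hears host 28, it simply keeps its original 4 second
timer, which is where the "1 second after host 28" comes from.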
If A and B are talking to each other and C and D are talking to each
other, why do (A and B) need to talk to C and D?
Ah, but how do you know that A doesn't talk to D, and is never going to?
How do you know it will in the time before the topology of the network
changes? Given that the topology of the network changes every time a
host comes and goes, the chance that you'll want to talk to most of the
users during that time is rather low.
Look at it this way: if two routers send out RAs every 10 seconds,
that's one packet every 5 seconds. If 60 hosts all send one packet
every five minutes, that's also one packet every 5 seconds.
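
The arithmetic behind that comparison, as a small sketch. It assumes the
senders are spread evenly over the interval; the function name is only
for illustration.

```python
def seconds_between_packets(senders: int, interval_s: float) -> float:
    """Average gap between packets on the link if each sender transmits once per interval."""
    return interval_s / senders


routers = seconds_between_packets(2, 10)     # two routers, one RA every 10 s each
hosts = seconds_between_packets(60, 5 * 60)  # 60 hosts, one packet every 5 minutes each
print(routers, hosts)  # both come out at one packet every 5 seconds
```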
I'd start at the minimum "MTU" size.
Yes, I thought about this and first trying a 1508 byte packet makes
sense: if jumboframes don't work, you've wasted as little time and
bandwidth as possible. If they do work, you've only wasted 1508 bytes.
A colleague of mine (Matthew Luckie) has done some research into path
MTUs. He has a work-in-progress paper
( http://www.wand.net.nz/~mjl12/debugging-pmtud.pdf ) where he enumerates
all the common MTUs he's seen on the Internet.
And reaches a very interesting conclusion! Exchanging per-neighbor
MTUs would really help here.
I'd start with a similar table, trying the lowest size first and sending
that. If it's received, try the next lowest size, and so on until you
don't get a reply. When you don't get a reply, try
previous-mtu-that-worked+1; if that succeeds, start a binary search
between previous-mtu-that-worked and the one that didn't.
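
One way to read that ramp-up procedure, as a sketch. The probe callback
(returns True if a packet of that size got a reply) and the table values
are assumptions for illustration; the table is not taken from the paper.

```python
from typing import Callable, Optional, Sequence

# Illustrative table only, not the list from the paper.
COMMON_MTUS = [1500, 1508, 2048, 4352, 4464, 4470, 9000]


def ramp_up_search(probe: Callable[[int], bool], table: Sequence[int]) -> int:
    """Walk the table upward, then refine with +1 and a binary search if needed."""
    best = 0
    failed: Optional[int] = None
    for size in sorted(table):
        if probe(size):
            best = size
        else:
            failed = size
            break
    if best == 0 or failed is None:
        return best  # nothing worked, or even the largest table entry worked
    if not probe(best + 1):
        return best  # the limit is exactly a table entry
    lo, hi = best + 1, failed  # invariant: lo works, hi does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid
        else:
            hi = mid
    return lo


print(ramp_up_search(lambda size: size <= 4464, COMMON_MTUS))  # limit is a table entry
print(ramp_up_search(lambda size: size <= 3000, COMMON_MTUS))  # limit is not in the table
```

With a limit that is in the table, the only probes that fail are the next
table entry and the +1 test.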
I partially agree. If you're at a well known boundary and want to
search upward, it makes sense to try that well known boundary +
minimum increment (I say: 4) first. That way, if you can't go beyond
the current boundary, you know so immediately. Next is the highest
possible value. If you can use that one, you're done.
But if previous low + minimum works but maximum doesn't, a mostly
binary search still makes sense. However, it could be a "hinted"
binary search. For instance, if you're searching between 1508 and
9000 (with the target being 4464) a strict binary search would do:
1 1508 yes
2 9000 no
3 5252 no
4 3380 yes
5 4316 yes
6 4784 no
7 4548 no
8 4432 yes
9 4488 no
10 4460 yes
11 4472 no
12 4464 yes
13 4468 no
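
The strict listing can be reproduced with a short sketch. It assumes the
midpoint is rounded down to the 4-byte increment, and it simulates the
link with a probe function instead of sending real packets; the names
are made up for illustration.

```python
from typing import Callable, List, Tuple

STEP = 4  # minimum increment


def strict_binary_search(
    probe: Callable[[int], bool], low: int, high: int, step: int = STEP
) -> Tuple[int, List[Tuple[int, bool]]]:
    """Return (largest working size found, list of (size, succeeded) probes)."""
    tried: List[Tuple[int, bool]] = []

    def attempt(size: int) -> bool:
        ok = probe(size)
        tried.append((size, ok))
        return ok

    if not attempt(low):
        return 0, tried
    if attempt(high):
        return high, tried
    lo, hi = low, high  # invariant: lo works, hi does not
    while hi - lo > step:
        # Midpoint, rounded down to a multiple of step above lo.
        mid = lo + max(step, ((hi - lo) // 2 // step) * step)
        if attempt(mid):
            lo = mid
        else:
            hi = mid
    return lo, tried


mtu, tried = strict_binary_search(lambda size: size <= 4464, 1508, 9000)
for number, (size, ok) in enumerate(tried, start=1):
    print(number, size, "yes" if ok else "no")
```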
A hinted binary search could be:
1 1508 yes
2 9000 no
3 4470 no (closest value to binary 5252 target)
4 2048 yes (closest value to binary 2988 target)
5 2052 yes (see if 2048 was our limit)
6 4352 yes (closest value to binary 3260 target)
7 4356 yes (see if 4352 was our limit)
8 4464 yes (closest value to binary 4412 target)
9 4468 no (4464 was our limit)
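
A sketch of how such a hinted search might be implemented. The hint table
holds only the values that appear in the listing, the link is simulated
by a probe function, and the function name is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

HINTS = [2048, 4352, 4464, 4470]  # only the "common" values used in the listing


def hinted_binary_search(
    probe: Callable[[int], bool], low: int, high: int,
    hints: Sequence[int], step: int = 4,
) -> Tuple[int, List[Tuple[int, bool]]]:
    """Binary search that probes the hint closest to each binary target."""
    tried: List[Tuple[int, bool]] = []

    def attempt(size: int) -> bool:
        ok = probe(size)
        tried.append((size, ok))
        return ok

    if not attempt(low):
        return 0, tried
    if attempt(high):
        return high, tried
    lo, hi = low, high  # invariant: lo works, hi does not
    while hi - lo > step:
        target = lo + max(step, ((hi - lo) // 2 // step) * step)
        inside = [h for h in hints if lo < h < hi]
        size = min(inside, key=lambda h: abs(h - target)) if inside else target
        if not attempt(size):
            hi = size
            continue
        lo = size
        # A hint that works may be the exact limit: check one step above it.
        if inside and lo + step < hi:
            if attempt(lo + step):
                lo += step
            else:
                hi = lo + step
    return lo, tried


mtu, tried = hinted_binary_search(lambda size: size <= 4464, 1508, 9000, HINTS)
for number, (size, ok) in enumerate(tried, start=1):
    print(number, size, "yes" if ok else "no")
```

When no hint lies between the current bounds, this falls back to the
plain binary target, so non-standard MTUs are still found.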
Note that although the second variant is faster overall, the first one
finds a reasonable candidate (that can already be used at that point)
at try 5, and the second one at try 6.
In this case your serial bottom-to-top search would probably be a bit
faster, but it has two disadvantages: it takes a long time to find a
high MTU, and it's not good at finding non-standard MTUs.
For a "common" MTU, you only have to endure two timeouts (the next
highest common MTU, and the +1 test). For an uncommon MTU you can
quickly ramp up to the highest "common" MTU below your actual MTU, and
only have to endure the timeouts from then on.
Note that with a 100 ms timeout (more than enough) you're done in
less than 2 seconds worst case.
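
A quick check of that bound, as a sketch. It deliberately overestimates
by charging every probe a full timeout, even the ones that get a reply;
the probe counts are the 13 and 9 tries from the two listings above.

```python
TIMEOUT_MS = 100  # per-probe timeout


def worst_case_ms(probes: int, timeout_ms: int = TIMEOUT_MS) -> int:
    """Upper bound on search time if every probe waited for the full timeout."""
    return probes * timeout_ms


print(worst_case_ms(13))  # strict binary search: 13 probes
print(worst_case_ms(9))   # hinted binary search: 9 probes
```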
So when system A tries with 3000 bytes (worked with C!) towards B, B sets
an ack flag and tries with 9216, which fails, so A sends a NAK and tries
with 6108, and so on.
Hang on, if they don't receive a packet, how can they know to send a
NAK? If they're just waiting for a timeout how can they know if the
packet got lost on the way there or on the way back?
Good question. :-)
If instead of using special "ICMP MTU Probes" we use "ICMP Echo Request"
/ "ICMP Echo Reply" messages, there are no changes needed to any packet
formats; all that needs to be done is to implement it in a TCP/IP stack,
and the concept is even reusable for IPv4. Other hosts don't even have
to be upgraded to support this either. Magic!
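
To make the sizing concrete, here is a sketch that builds an Echo Request
padded so the resulting IP packet has exactly the size being probed. It
uses the IPv4 case because the ICMPv4 checksum needs no pseudo-header,
and it assumes a 20-byte IP header without options. The function names
are made up; actually sending the packet (raw socket, no fragmentation
by the sender) is left out.

```python
import struct

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes
ECHO_REQUEST = 8  # ICMPv4 type


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as used by ICMPv4."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request_for_mtu(mtu: int, ident: int = 0, seq: int = 0) -> bytes:
    """ICMPv4 Echo Request whose enclosing IP packet is exactly mtu bytes long."""
    payload_len = mtu - IPV4_HEADER - ICMP_HEADER
    if payload_len < 0:
        raise ValueError("mtu too small for an echo request")
    payload = bytes(payload_len)  # zero padding
    unsummed = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(unsummed + payload)
    return struct.pack("!BBHHH", ECHO_REQUEST, 0, checksum, ident, seq) + payload


packet = echo_request_for_mtu(1508, ident=1, seq=1)
print(len(packet) + IPV4_HEADER)  # total IP packet size for this probe
```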
You mean, rely just on ICMP and not announce a bigger MTU in RAs?
I guess you're right, but I wouldn't want to be a 10 Mbps host in an
otherwise 64k jumbo-enabled network, because all those probes would
eat up my bandwidth even though I can't successfully receive them.
Also, I think we want to be nicer to on-link probers than off-link
ones, especially with these large packets.
Stacks would be free to do as you suggest (doing a binary search) or as
I suggest (ramp up and do a binary search only as a last resort).
Yes, this can be left up to the implementers.
So the general approach would be:
* If a packet arrives from a host that is larger than the cached MTU for
that neighbour, increase it to the size of the packet arriving.
Not sure if we want to do this check for every packet. Also, an
attacker could fake the packet in order to do an "MTU attack" on a
non-jumbo enabled host.
* When receiving a ND (but not a NS!), and you have no cached MTU for
that neighbour, you start the MTU discovery process, using any mechanism
for selecting the packet sizes the implementation deems appropriate
(i.e., either yours, or mine, or if someone can come up with a method
that's even better than ours, they could use that!).
With an MTU option in it. And why not NS?
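
The two rules can be sketched as a per-neighbour cache. The class and
method names are hypothetical, neighbours are identified by plain
strings, and the first rule is read as applying only once a value is
cached. Nothing here validates the sender, so the spoofing concern
raised above applies to on_packet as written.

```python
from typing import Callable, Dict, Optional


class NeighbourMtuCache:
    """Per-neighbour MTU cache following the two rules in the list above."""

    def __init__(self, start_discovery: Callable[[str], None]) -> None:
        self._mtu: Dict[str, int] = {}
        self._start_discovery = start_discovery

    def get(self, neighbour: str) -> Optional[int]:
        return self._mtu.get(neighbour)

    def record_result(self, neighbour: str, mtu: int) -> None:
        """Store the outcome of an MTU search for this neighbour."""
        self._mtu[neighbour] = mtu

    def on_packet(self, neighbour: str, size: int) -> None:
        """Rule 1: a packet larger than the cached MTU raises the cached value."""
        cached = self._mtu.get(neighbour)
        if cached is not None and size > cached:
            self._mtu[neighbour] = size

    def on_neighbour_discovery(self, neighbour: str) -> None:
        """Rule 2: no cached MTU for this neighbour yet, so start probing."""
        if neighbour not in self._mtu:
            self._start_discovery(neighbour)


started = []
cache = NeighbourMtuCache(started.append)
cache.on_neighbour_discovery("B")  # nothing cached: discovery starts
cache.record_result("B", 4464)     # the search settled on 4464
cache.on_packet("B", 9000)         # a larger packet arrived from B
print(started, cache.get("B"))
```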
No, the announcement "the switch can handle 4500 bytes" wouldn't have
anything to do with "I can handle 1500".
Which switch? I live in a flat with 3 other people, we have at least 4
devices that act like switches on one segment (2 switches, a VoIP
phone (you can daisy chain a PC off it), and an AP). I have no idea
what the maximum MTU of all those switches is.
If all of those switches announce their MTU, we're in business.
On the other hand, if we do an MTU search we don't need this
information because we'll find out ourselves.
If we don't do an MTU search and the switches don't announce their
MTU, you're probably not going to use jumboframes on such a network...
It would be even better if we could ask the switch what our port
supports, but I'm not sure how to do this without a switch that doesn't
support this protocol flooding the request, which would make the
results meaningless.
Hrm, so Ethernet has capability negotiation (which is how speed, duplex,
pause frame support etc. is negotiated). I have no idea if it says
whether the switch supports jumbo frames; IEEE specs make my head hurt.
Autonegotiation only does 16 bits or something like that, no room to
include the MTU there. Gigabit does have some in-band stuff like flow
control, maybe that can be reused. But you always run the risk that a
dumb switch just forwards those packets and screws up the negotiation.
--------------------------------------------------------------------
IETF IPv6 working group mailing list
[email protected]
Administrative Requests: https://www1.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------