Hi Kris:
Resending one lost fragment under the fragmentation sub-layer would be
neat indeed.
IBM's SNA employed error recovery procedures (ERP) at layer 2, in
particular SDLC, LAPB and 802.2. Note that none of these can actually
resend a specific piece, and all of them require in-order delivery.
HPR's RTP could do it, but it was little known and rarely deployed,
mostly used with IBM's Parallel Sysplex.
TCP tolerates out-of-order delivery but does not like it: fast recovery
is quickly triggered.
I'd be interested in discussing the requirements for an ERP at the DLL.
I'm not sure that it is in scope for ISA100.11a right now, is it? I do
not mean it would be wrong, just that we should make sure we are
considering real issues, and real solutions.
Pascal
-----Original Message-----
From: Kris Pister [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 28, 2007 10:30 PM
To: Pascal Thubert (pthubert)
Cc: Phy/DLL/Network/Transport group
Subject: Re: [sp100.11a-pdnt] Re: Transport layer requirements; first
cut
* fragmentation
Here's the way that I see fragmentation:
Mote A fragments the packet. The first fragment contains the
fragmentation info, a delete/ARQ mode bit, and a time limit: "if you
haven't seen all 9 segments in 30 seconds, either {delete what you have}
or {send me a message saying which ones you're missing}".
These packets work their way through the network, potentially over many
paths, each arriving at its destination potentially out of order but
with over 99.9% probability. The receiving mote reassembles the lot,
and checks the transport MIC on the entire thing (no transport MIC per
segment, just the DLL MIC).
If we can't hit 100B of transport payload per packet, we need to go back
and fix 6LoWPAN (well, actually I already know that we have to fix
6LoWPAN), so your TFTP chunk will take no more than 6 fragments no
matter what the security level is. So the probability of losing one or
more segments is less than 1%. Even a 1280B chunk should have less than
1% probability of a missing segment. I can't *prove* that, but I have a
LOT of data from real deployments to support the assertion.
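The arithmetic behind the sub-1% figure can be sketched quickly, under the
assumptions that each fragment independently arrives with the stated 99.9%
probability and that a chunk is lost whenever any one fragment is missing
(independence is an optimistic simplification, not something from the thread):

```python
# Probability that a multi-fragment chunk arrives incomplete, assuming
# each fragment independently arrives with probability p_fragment.
def loss_probability(p_fragment: float, n_fragments: int) -> float:
    """Return the probability that at least one of n fragments is lost."""
    return 1.0 - p_fragment ** n_fragments

# A 6-fragment TFTP chunk at 99.9% per-fragment delivery: about 0.6%,
# comfortably under the 1% quoted above.
print(loss_probability(0.999, 6))
```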
But even if the probability is high that one or more segments is lost,
the timeout for ARQ included in the source makes things simple, and the
ARQ itself keeps things from being too inefficient - you don't throw out
95% of a chunk, you just ask for the last 5% (if that's the way that the
sender requested it - some applications may want to explicitly say to
throw the fragments away if they haven't all arrived by a certain time).
Motes can have a default timeout for fragment sets for which they have
not yet received the first fragment.
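The receiver-side behavior described above could be sketched as follows.
This is only an illustration of the delete/ARQ idea; the bookkeeping, the
`FragmentSet` name, and the timer interface are all assumptions of mine,
not anything specified in the thread or the draft:

```python
import time

DELETE, ARQ = 0, 1  # the delete/ARQ mode bit carried in the first fragment

class FragmentSet:
    """Tracks one in-flight fragmented packet at the receiving mote."""

    def __init__(self, total, mode, time_limit, now=None):
        self.total = total            # e.g. 9 segments expected
        self.mode = mode              # DELETE or ARQ, chosen by the sender
        self.deadline = (now if now is not None else time.monotonic()) + time_limit
        self.received = {}            # fragment index -> payload bytes

    def add(self, index, payload):
        self.received[index] = payload

    def missing(self):
        return [i for i in range(self.total) if i not in self.received]

    def on_timer(self, now=None):
        """Called periodically; returns the action for an expired set."""
        current = now if now is not None else time.monotonic()
        if current < self.deadline or not self.missing():
            return None
        # "if you haven't seen all segments in time, either delete what
        #  you have, or ask the sender for the ones you're missing"
        return ("delete",) if self.mode == DELETE else ("nack", self.missing())
```

For example, a set in ARQ mode holding fragments 0..7 of 9 would return
`("nack", [8])` once the deadline passes; the transport MIC is still checked
only on the fully reassembled packet, as described above.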
I'm guessing that these ideas are not new, and that if I knew the IETF
literature better I'd be able to point to an RFC that describes
something like this.
ksjp
Pascal Thubert (pthubert) wrote:
[...]
[Pascal] That was the beginning of this thread. A major drawback of
fragmentation is that if most frags for a packet make it but not all of
them, the whole effort is wasted because the packet cannot be
reassembled, and it will be retried or lost. More than that, resources
are locked on the receive side for some time, waiting for the frag that
will never come. If fragments are not in order, the situation is very
hard to even detect. So it is generally better to send a flow over the
same path, so that if something goes wrong a given flow is either fully
impacted or not impacted at all.
Say we do a file transfer of a software image to a mote. TFTP will
chunk it into 512-byte TPDUs. Those will make around 8 or 9 6LoWPAN
fragments if security is high. If the fragments are disseminated over
many different paths and one fragment gets lost because one path is
faulty, the effort of carrying all the other fragments is worse than
wasted: it actually locks resources in the receiving mote for a long
time. If we have less than 90% reliability, that means that most
packets will never make it through intact. If we have a number of
transfers going on, it might happen that the full bandwidth of the
network is used while nothing is actually received at the transport
layer, and all the available memory is locked in the motes.
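The 8-or-9-fragment figure follows from how much of the 127-byte 802.15.4
frame is left for payload once MAC, 6LoWPAN fragmentation, and security
overheads are subtracted; the specific per-fragment payload budgets below
(64B and 60B) are illustrative assumptions, not numbers from the thread:

```python
import math

TPDU = 512  # one TFTP chunk, in bytes

def fragment_count(per_fragment_payload: int) -> int:
    """Number of 6LoWPAN fragments needed to carry one TPDU."""
    return math.ceil(TPDU / per_fragment_payload)

# ~64B of payload left per frame -> 8 fragments;
# ~60B left (higher security overhead, e.g. a larger MIC) -> 9 fragments.
print(fragment_count(64), fragment_count(60))
```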
Obviously, if 99.999% of packets make it through thanks to path
redundancy and retries, then the problem is much less visible and the
benefits of the simple mechanism that Rick detailed prevail. I'm just
unsure where the limit is, so I'm pointing out the potential threat and
proposing classical techniques for containing that risk. Maybe we can
prove it's not needed, and I'm sorry for wasting your time, but at
least we can document the case; or maybe it will turn out to be needed,
and I have proposed a simple hash to mitigate the risk. There are also
more complex solutions in the mesh state of the art that use Forward
Error Correction, network coding, or even an LLC with an Error Recovery
Procedure (like 802.2) to alleviate that risk...
Pascal