On Thu, Oct 11, 2012 at 10:05 PM, Xiaohui Liu <[email protected]> wrote:
> Thanks again for your patient reply. Please CIL.

No problem. You clearly have done your homework and I don't mind working
with folks like that. And you are clearly interested in a robust
implementation. I want to encourage that.

> On Thu, Oct 11, 2012 at 8:57 PM, Eric Decker <[email protected]> wrote:
>
>> On Wed, Oct 10, 2012 at 10:53 PM, Xiaohui Liu <[email protected]> wrote:
>>
>>> Hi Eric,
>>>
>>> Thanks for your prompt reply. Please see my comments inline.
>>>
>>> On Wed, Oct 10, 2012 at 9:12 PM, Eric Decker <[email protected]> wrote:
>>>
>>>> Welcome to the real world.
>>>>
>>>> A production/robust stack should handle these issues. I haven't spent
>>>> enough time with the current cc2420 stack to know how far it was taken.
>>>>
>>>> I will one of these days visit this issue because I need a production
>>>> quality radio stack for the CC2520 radio. So will dig into it at that
>>>> time.
>>>>
>>>> In the meantime ...
>>>>
>>>> On Wed, Oct 10, 2012 at 4:05 PM, Xiaohui Liu <[email protected]> wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> Normally, if a packet is corrupted, the CRC is supposed to detect and
>>>>> discard it, at least with very high probability. However, if the
>>>>> corruption happens on the length field, are these issues handled by
>>>>> the radio stack?
>>>>
>>>> This issue is partially handled by the existence of a special sequence
>>>> of symbols called the SFD (Start Frame Delimiter).
>>>>
>>>> Also if the length is corrupted the CRC will cause the packet to
>>>> abort. There is a defined maximum length for a packet which will also
>>>> cause the packet to abort prior to doing a CRC calculation. So if the
>>>> packet is too long but below the max length, the CRC will be computed and
>>>> should cause the packet to abort. Seeing the start of another frame
>>>> should also cause the packet to abort.
>>> By aborting a packet, you mean the CC2420 h/w terminates its reception
>>> and the packet will not be placed at RXFIFO, or the packet is placed into
>>> RXFIFO but flushed by the CC2420 driver later?
>>
>> yes. Depends on the system design and how it works and what is
>> enabled. The CC2420 is very flexible. I don't know what the current
>> design has programmed it to do. I do know that mostly the simple default
>> implementation was done based on what Chipcon originally provided as the
>> reference implementation. Typically reference implementations don't
>> handle the corner/exception cases. The problem you are describing is
>> certainly one of the exception cases.
>>
>>>>> 1) the corrupted length is larger than the actual bytes buffered in
>>>>> RXFIFO, and RXFIFO.beginRead() or RXFIFO.continueRead() is called to
>>>>> read rxFrameLength bytes from it. What could happen in this case?
>>>>> Underflow?
>>>>
>>>> There should be special signalling between the CC2420 h/w and the
>>>> CC2420 driver. This signalling should tell the driver that we have an
>>>> abnormal packet. Either we have received another SFD prior to the current
>>>> packet finishing or we have an underflow (the h/w stopped seeing
>>>> symbols prior to its idea of when the packet should be finished).
>>>>
>>>> Both of those conditions must be handled in a robust driver. I don't
>>>> know how far the implementers of the current CC2420 driver took things. In
>>>> the little bit of looking I've done, I haven't seen evidence that these
>>>> conditions are handled. The CC2420 driver was based (from the looks of
>>>> things) on some sample TI/Chipcon code that talks to the chip using some
>>>> default setup. And special handling of exception processing has to be
>>>> programmed into the chip special. It is actually quite a capable chip.
>>>> But it has to be because it is an off-load processor. (I used to design
>>>> this kind of stuff (inter cpu communications, multi-processors, etc.) when
>>>> things were much much larger.)
>>>>
>>>>> 2) the CRC is read at the wrong offset ((buf[rxFrameLength] >> 7)).
>>>>> And it can happen to be 1, causing the corrupted packet to be signaled to
>>>>> upper layers as a correct packet even though it is not.
>>>>
>>>> I don't understand what you are saying is the behaviour. If the CRC
>>>> is at the wrong offset, how can it be correct? I think one should get an
>>>> indication that the packet is bad.
>>>
>>> AFAIK, there is no special signalling you mentioned above between the
>>> CC2420 h/w and the current CC2420 driver. This can happen when the h/w
>>> passes a packet with corrupted length field to the driver. For instance, if
>>> the correct rxFrameLength is 21, but it is corrupted and now becomes 20.
>>
>> Have you physically seen this happen?
>
> Yes.

Ouch. Well, that is certainly a problem that needs to be investigated. Is
this easily reproducible?

> Only packets with fixed length are sent, but packets with different length
> are received by the default driver, passing CRC check.

That shouldn't happen, and it indicates a bug in the h/w (actually it's a
h/w s/w implementation).

> Out of all these packets, about 3% are actually corrupted as my
> measurement shows, a much higher false positive ratio than what CRC is
> supposed to give. This is why I started investigating this issue in the
> first place.
>
> Is there any way to discard such packets with corrupted length field, maybe
> at the h/w level?

There should be. I'm not sure how to get closer to the problem down in the
h/w. First, I would try to investigate what can be done using the interface
to the cc2420, in particular whether something can be done with regards to
its configuration. I would also probably at this point start talking to TI
(perhaps the E2E forums).
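
To make the offset problem from point 2) in the quoted text concrete, here is a small sketch in C. It only assumes what was described above, namely that the driver takes the top bit of buf[rxFrameLength] as the "CRC OK" flag. The helper name crc_ok_bit and any buffer contents you feed it are hypothetical and not taken from the actual CC2420 driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the real driver code: returns the bit that is
 * treated as "CRC OK", i.e. the top bit of the byte at index
 * rx_frame_length. Which byte gets inspected depends entirely on the
 * length value, so a wrong length means a wrong byte is consulted.
 */
int crc_ok_bit(const uint8_t *buf, uint8_t rx_frame_length)
{
    return buf[rx_frame_length] >> 7;
}
```

With a made-up buffer where buf[21] is 0x6A (top bit clear) and buf[20] is 0xD3 (top bit set), crc_ok_bit(buf, 21) gives 0 while crc_ok_bit(buf, 20) gives 1. So an off-by-one length can turn a "bad" indication into a "good" one purely because an unrelated byte happens to have its top bit set.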
> CRC checking does not work well since the offset of CRC bit is wrong.

Well, not exactly. CRC actually works very well. The problem (which I just
figured out) is an issue with the basic protocol definition for the very
lowest layer. Namely, and I don't know why, the designers didn't make the
CRC cover the length field at the h/w level. This was a big mistake and you
are seeing how it manifests.

I doubt that there is any way to fix this at the lowest layer (the h/w) and
would suggest that the thing to do would be to make the next higher layer
compensate for this. This is exactly why, when I design stuff, I design in
redundancy. I'm extremely experienced in embedded systems and in particular
internetwork routers and protocol stacks. I worked on cpus at HP and then
was early at cisco. Wrote lots of router code.

My suggestion is we should redo the HDLC layer that is just below the AM
layer to put in a redundant length.
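
A rough sketch of what a redundant length could look like, in plain C rather than nesC. Everything here is an assumption for illustration: the two-byte header (length plus its bitwise complement), the names wrap_payload and frame_length_ok, and the use of 127 bytes (the 802.15.4 maximum PHY payload) as the upper bound. The usable payload in a real stack is smaller once the MAC headers are accounted for. The point is only that the copy travels inside the payload, which the CRC does cover, so the receiver can cross-check it against the length the lower layer reports.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAME_LEN 127u   /* largest 802.15.4 PHY payload, in bytes */
#define RLEN_HDR_SIZE 2u     /* hypothetical header: len byte + ~len byte */

/*
 * Sender side: prepend a redundant copy of the payload length and its
 * bitwise complement. Returns the total number of bytes written to out,
 * or 0 if the result would not fit.
 */
size_t wrap_payload(const uint8_t *payload, uint8_t payload_len,
                    uint8_t *out, size_t out_cap)
{
    size_t total = (size_t)payload_len + RLEN_HDR_SIZE;

    if (total > out_cap || total > MAX_FRAME_LEN)
        return 0;

    out[0] = payload_len;
    out[1] = (uint8_t)~payload_len;
    memcpy(out + RLEN_HDR_SIZE, payload, payload_len);
    return total;
}

/*
 * Receiver side: received_len is the length reported by the lower layer
 * (which may be corrupted). Returns 1 only if it is within bounds, the
 * redundant copy is self-consistent, and both lengths agree.
 */
int frame_length_ok(const uint8_t *frame, size_t received_len)
{
    if (received_len < RLEN_HDR_SIZE || received_len > MAX_FRAME_LEN)
        return 0;
    if (frame[0] != (uint8_t)~frame[1])
        return 0;
    return (size_t)frame[0] + RLEN_HDR_SIZE == received_len;
}
```

A receiver would call frame_length_ok() before looking at anything else and drop the frame on a mismatch. Storing the complement next to the length is just one cheap way to catch a damaged copy; a small checksum over the header would serve the same purpose.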
_______________________________________________
Tinyos-help mailing list
[email protected]
https://www.millennium.berkeley.edu/cgi-bin/mailman/listinfo/tinyos-help
