Hello Miklos,

>> According to the AT86RF230 datasheet, a transition from P_ON to TRX_OFF
>> has a typical duration of 880µs. The driver uses a value of 510µs (which
>> can be found in Atmel's AT86RF230 software programming model)
>
> Actually, the init code waits twice for 510 us, first to get clock,
> then we wake it up, wait more.

2*510 > 880 is right, but the sequence matters. There is a wait block
before anything else is done, then the chip is reset for 6µs.
Immediately after this reset, the state transition from P_ON to TRX_OFF
gets issued, and after 510µs (not 880µs) the driver treats the chip as
if it were in TRX_OFF.

And because I already mentioned reset:

>> The reset timings appear to differ from the datasheet, too (and here
>> again, the software programming model says something different).
>> According to the datasheet, the device requires a typical value of 120µs
>> after a reset condition until it is operational again.
>>> During the reset procedure the SPI interface shall be inactive ( SEL = H;
>>> SCLK = L).
>> (AT86RF230 Datasheet)
>
> And I think we wait longer, no?

The timing for RESET->P_ON is not specified in the datasheet, but from a
conservative point of view I would use the 120µs noted for
RESET->TRX_OFF. The datasheet only mentions 625ns after releasing RESET;
however, the current implementation does not wait at all.

>> This leads directly to the main issue I would like to discuss: The whole
>> driver behaves very optimistically and uses the typical values found in
>> the datasheet as worst-case values. I have not seen any countermeasures
>> for cases in which the radio device uses more time than this typical
>> value; in those cases the whole radio communication might (and in fact,
>> does if provoked in simulation cases) lock up.
>> I have not found any line of code where the device state gets read
>> before a state change command is issued, and no hard timeouts for cases
>> where the radio is in a different state than expected.
>
> I would be very interested in learning where such measures could be
> provided. The problem with all this is the following: what happens if
> the hardware never does what you want. How long are you going to wait?
> In general I did not want to set timers, because that consumes
> resources (and all higher level code shares a single alarm with the
> driver).

Worst case limits: I think table 7-2 (Block settling time) in the
datasheet might provide a foundation for some reasonable timeouts. If
the radio does not fulfil those specifications, it should be treated as
defective, imho.

I do not know much about TinyOS internals. I think there is no way
around some timer-based polling. The watchdog you mentioned later is
just a slightly different approach ...

>> To sum it up: The driver uses typical timing values as worst-case
>> timings and does not follow any conservative approach (as stated in the
>> datasheet on page 21).
>>> The radio transceiver state is controlled by two signal pins (SLP_TR,
>>> RST) and the register 0x02 (TRX_STATE). A successful state change
>>> shall be confirmed by reading the radio transceiver status from
>>> register 0x01 (TRX_STATUS).
>> (AT86RF230 Datasheet)
>
> That is true.
>
>> Despite all these issues, the driver appears to work on real devices.
>> I assume the datasheet contains values with rather large safety margins,
>> but in my opinion, violating those specifications is not good.
>> Especially because the datasheet provides the only mandatory device
>> characteristics for building a simulation.
>
> I agree, that it is not good practice to violate those limits. There
> are other reasons that a radio driver can lock up, and that is what has
> caused a lot of trouble for me. In certain situations the whole chip
> can lock up, especially when you command it to do something, but at
> the very same moment (in around 1 us window !!!) an incoming message
> is received, then the internal state machine of the chip can lock up,
> and will never receive the message, nor complete the transmit. Only a
> reset helps.
> This seems to be fixed in recent IRIS motes with a new
> hardware revision of the chip (I have reported this to Atmel but did
> not get a reply). Similar issues prop up with the RF212 and RFA1
> drivers.

Strange. The errata for RF230 rev. A list some nasty silicon bugs, but
nothing that matches your issue. As far as I understand the datasheet, I
would expect the device to do the transition as soon as the message has
been received.

Just because I am curious: Have you tried FORCE_TRX_OFF, too? What did
the TRX_STATUS subregister indicate?

> So based on all this, it seems that a watchdog component should be
> installed above all radio drivers (especially in production mode),
> which would trigger the reset of the radio if some timing constraints
> are not met. I did not do this yet. So my plan is to make sure that
> the radio can be reset/restarted at any moment reliably, and then do
> this watchdog component. What do you think?

Sounds reasonable. After you wrote about a watchdog, I got an idea for a
two-and-a-half-strikes approach.

First strike: Timeout violation. Kick off some countermeasures in the
driver, trying to avoid a full reset and instead just recover from minor
timing issues (like atypical behaviour). For example, this could be the
place for some state checking/validation/resynchronisation between
driver and radio, allowing a re-triggering of missed commands.

Second strike: Big lockup. Do a hard reset of the radio device and a
(possibly expensive) initialization.

Post second, "not really a strike": The radio is broken. Make a last
effort to disable it and do not touch it any more afterwards.

Maybe this approach is too complex/expensive, but it would allow
fine-grained error handling.

Regards,
Markus

_______________________________________________
Tinyos-help mailing list
[email protected]
https://www.millennium.berkeley.edu/cgi-bin/mailman/listinfo/tinyos-help
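
The strike-based escalation proposed in the message above, combined with
the datasheet's advice to confirm every state change by reading
TRX_STATUS, could look roughly like the C sketch below. It is only an
illustration under assumptions: radio_hal_t, wait_for_status,
radio_goto_trx_off, the sim_* helpers and the 10µs poll interval are
hypothetical names and values, not part of the TinyOS RF230 driver. The
timeout is left to the caller (for example, derived from the datasheet's
settling-time table), and the simulated radio exists only so the logic
can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values as listed in the AT86RF230 register description;
 * verify them against the datasheet revision you are using. */
enum { STATUS_TRX_OFF = 0x08, CMD_TRX_OFF = 0x08, CMD_FORCE_TRX_OFF = 0x03 };
enum { POLL_US = 10 };  /* assumed poll interval, not from the datasheet */

/* Hypothetical hardware abstraction; not a TinyOS interface. */
typedef struct {
    uint8_t (*read_status)(void *ctx);           /* TRX_STATUS, reg 0x01 */
    void (*write_state)(void *ctx, uint8_t cmd); /* TRX_STATE, reg 0x02 */
    void (*hard_reset)(void *ctx);               /* pulse RST and re-init */
    void (*delay_us)(void *ctx, uint32_t us);
    void *ctx;
} radio_hal_t;

typedef enum { RADIO_OK, RADIO_RECOVERED, RADIO_RESET, RADIO_DEAD } radio_result_t;

/* Poll TRX_STATUS until it reads `want` or `timeout_us` has elapsed. */
static bool wait_for_status(const radio_hal_t *hal, uint8_t want,
                            uint32_t timeout_us)
{
    for (uint32_t waited = 0;; waited += POLL_US) {
        /* Only bits 4..0 of TRX_STATUS carry the state. */
        if ((hal->read_status(hal->ctx) & 0x1F) == want)
            return true;
        if (waited >= timeout_us)
            return false;
        hal->delay_us(hal->ctx, POLL_US);
    }
}

radio_result_t radio_goto_trx_off(const radio_hal_t *hal, uint32_t timeout_us)
{
    /* Normal path: issue the command, then confirm it. */
    hal->write_state(hal->ctx, CMD_TRX_OFF);
    if (wait_for_status(hal, STATUS_TRX_OFF, timeout_us))
        return RADIO_OK;

    /* First strike: cheap recovery, force the state instead of resetting. */
    hal->write_state(hal->ctx, CMD_FORCE_TRX_OFF);
    if (wait_for_status(hal, STATUS_TRX_OFF, timeout_us))
        return RADIO_RECOVERED;

    /* Second strike: hard reset plus (possibly expensive) re-init. */
    hal->hard_reset(hal->ctx);
    if (wait_for_status(hal, STATUS_TRX_OFF, timeout_us))
        return RADIO_RESET;

    /* Not really a strike: give up; the caller should stop using the radio. */
    return RADIO_DEAD;
}

/* --- Simulated radio, for illustration only --- */
typedef struct {
    uint8_t status;      /* current TRX_STATUS value */
    int stuck;           /* 0: obeys all, 1: needs FORCE, 2: needs reset, 3: dead */
    uint32_t elapsed_us; /* total simulated delay */
} sim_radio_t;

static uint8_t sim_read_status(void *ctx) { return ((sim_radio_t *)ctx)->status; }

static void sim_write_state(void *ctx, uint8_t cmd)
{
    sim_radio_t *r = ctx;
    if ((cmd == CMD_TRX_OFF && r->stuck < 1) ||
        (cmd == CMD_FORCE_TRX_OFF && r->stuck < 2))
        r->status = STATUS_TRX_OFF;
}

static void sim_hard_reset(void *ctx)
{
    sim_radio_t *r = ctx;
    if (r->stuck < 3)
        r->status = STATUS_TRX_OFF;
}

static void sim_delay_us(void *ctx, uint32_t us) { ((sim_radio_t *)ctx)->elapsed_us += us; }

radio_hal_t sim_hal(sim_radio_t *r)
{
    radio_hal_t hal = { sim_read_status, sim_write_state,
                        sim_hard_reset, sim_delay_us, r };
    return hal;
}
```

With a simulated radio that starts in BUSY_RX (0x01) and has stuck set to
1, radio_goto_trx_off() returns RADIO_RECOVERED, so the caller can tell a
minor timing hiccup from a full reset. Busy-wait polling is used here
only to keep the sketch short; in a real driver the polling would have to
share the single alarm mentioned earlier in the thread.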
