I dug a bit into the DMA implementation for SPI and found a potential
bug that makes the stack hang when duty-cycling.
It is related to the earlier problem where the SPI bus was not properly
released.
An infinite loop can occur in the following scenario (using DMA):
1. A DMA transfer is initiated (either from CC2420ReceiveP or
CC2420TransmitP).
2. In the middle of the transmission, the duty-cycling mechanism turns
off the Chipcon radio. As the DMA controller has not transferred all
bytes, it is still waiting for a USART interrupt.
3. The radio is powered on again and the command for starting the
oscillator is sent via SPI. Now the DMA controller sees the UTXIFG0
flag and copies the received byte to the destination memory of the
pending DMA transfer. Because the flag is cleared after this operation,
the SpiByte.write command waits forever for a flag.
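The three steps above can be modelled in a few lines of plain C. This is only a sketch of the race, not TinyOS code: the struct, the function names and the bounded spin count are all hypothetical, and the real SpiByte.write loop has no such bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the shared USART state; not the TinyOS sources. */
typedef struct {
    bool    irq_flag;     /* raised by hardware when a byte has been shifted */
    uint8_t rx_byte;      /* last byte clocked in */
    bool    dma_armed;    /* an earlier DMA transfer is still waiting for bytes */
    uint8_t dma_dest[8];  /* destination memory of that pending transfer */
    int     dma_count;    /* bytes the DMA controller has copied so far */
} usart_model_t;

/* One byte has been shifted: raise the flag. An armed DMA channel reacts
 * first, copies the byte and clears the flag again. */
static void hw_byte_done(usart_model_t *u, uint8_t rx) {
    u->rx_byte = rx;
    u->irq_flag = true;
    if (u->dma_armed && u->dma_count < (int)sizeof u->dma_dest) {
        u->dma_dest[u->dma_count++] = rx;
        u->irq_flag = false;  /* the flag the busy-wait needs is gone */
    }
}

/* Busy-wait byte write, bounded by max_spins so the model terminates.
 * Returns true if the flag was seen, false if the wait never ended. */
static bool spi_byte_write(usart_model_t *u, uint8_t tx, int max_spins) {
    assert(max_spins > 0);
    hw_byte_done(u, (uint8_t)~tx);  /* stand-in for the byte clocked back */
    for (int i = 0; i < max_spins; i++) {
        if (u->irq_flag) {
            u->irq_flag = false;
            return true;
        }
    }
    return false;
}
```

With dma_armed left false the write returns true on the first spin. With dma_armed set, as after an interrupted transfer, the byte lands in dma_dest and the write never sees its flag.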
As proposed earlier, this problem could be solved with a change in the
start/stop procedure in CC2420TransmitP/CC2420ReceiveP (e.g. allowing
the radio to be stopped only if no DMA transfer is ongoing).
Another approach would be to introduce some kind of SpiPacket.cancel()
command to clear the DMA transmission when the Chipcon radio is stopped.
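The two proposals can be contrasted with a small C model. Everything here is an assumption made for illustration (the struct, the function names, the result codes); in particular no cancel command exists in the stack as discussed in this thread, it is only being proposed.

```c
#include <stdbool.h>

/* Hypothetical radio state; not the TinyOS sources. */
typedef struct {
    bool radio_on;
    bool dma_busy;  /* a DMA transfer was started and has not completed */
} radio_model_t;

typedef enum { STOP_OK, STOP_EBUSY } stop_result_t;

/* Proposal 1: refuse to stop while a DMA transfer is in flight.
 * The caller has to retry once the transfer has finished. */
static stop_result_t radio_stop_guarded(radio_model_t *r) {
    if (r->dma_busy) {
        return STOP_EBUSY;
    }
    r->radio_on = false;
    return STOP_OK;
}

/* Proposal 2: cancel the pending transfer first, then stop.
 * Clearing dma_busy stands in for disarming the DMA channel. */
static stop_result_t radio_stop_with_cancel(radio_model_t *r) {
    r->dma_busy = false;
    r->radio_on = false;
    return STOP_OK;
}
```

The guarded variant keeps the transfer intact but delays the power-down; the cancel variant powers down immediately and discards the partial transfer. Either way no armed DMA channel is left behind when the radio is next started.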
Other suggestions?
Roman
Roman Lim wrote:
Hi David
I now have a version of the stack that does not crash in my test
application (at least I did not observe any crashes).
With the attached changes to CC2420SpiImplP.nc, it worked fine, but
only without DMA support.
I added some additional checks for the resource sharing. Before that,
the resource (provided by CC2420SpiImplP.nc) could be released before
the granted event was signaled (when m_holder was already set, but
SpiResource.granted had not yet been signaled).
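The attached changes are not reproduced in this thread, so the following C model is only a guess at the kind of check described: it refuses a release that arrives after the holder has been recorded but before the grant has been signaled. The granted_signaled flag and all function names are hypothetical.

```c
#include <stdbool.h>

#define NO_HOLDER 0xFFu

/* Hypothetical arbiter state; holder plays the role of m_holder. */
typedef struct {
    unsigned char holder;    /* client that owns the bus, or NO_HOLDER */
    bool granted_signaled;   /* assumed extra flag: grant event delivered? */
} spi_arbiter_t;

static void arbiter_init(spi_arbiter_t *a) {
    a->holder = NO_HOLDER;
    a->granted_signaled = false;
}

/* Record the new holder; the granted event is still pending. */
static bool arbiter_request(spi_arbiter_t *a, unsigned char client) {
    if (a->holder != NO_HOLDER) {
        return false;
    }
    a->holder = client;
    a->granted_signaled = false;
    return true;
}

/* Deliver the pending granted event to the recorded holder. */
static void arbiter_signal_granted(spi_arbiter_t *a) {
    if (a->holder != NO_HOLDER) {
        a->granted_signaled = true;
    }
}

/* Release succeeds only for the holder and only after the grant. */
static bool arbiter_release(spi_arbiter_t *a, unsigned char client) {
    if (a->holder != client || !a->granted_signaled) {
        return false;
    }
    arbiter_init(a);
    return true;
}
```

A release attempted between arbiter_request and arbiter_signal_granted is rejected, so the bus cannot be handed to someone else while a grant is still in flight.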
Roman
David Moss wrote:
Roman -
Just wanted to confirm I got your email. Your new observations are
definitely valuable in solving this problem. On my end, I'm kind of
swamped until the middle of next week. After that point, duplicating
and resolving this issue will be one of the top todo's on my list.
Thanks again,
-David
-----Original Message-----
From: Roman Lim [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 11, 2007 1:18 AM
To: David Moss
Cc: 'Philip Levis'; 'Jonathan Hui'; [EMAIL PROTECTED]
Subject: Re: [Tinyos-help] tinyos2 cc2420lpl locked-up node
Hi David
I tried out your fix. The problem still occurs, even if I decrease
the LPL receive check interval (I used 300 ms).
I made a few additional observations (maybe they could help fix
the problem):
If I decrease the time the USART resource is occupied by a
component other than the radio stack, the time until the first node
freezes increases.
I added a test variable in CC2420ControlP to find the exact line in
the code where the lockup occurs. The code hangs when starting the
CC2420 at this line (162):
call IOCFG1.write( CC2420_SFDMUX_XOSC16M_STABLE << CC2420_IOCFG1_CCAMUX );
just after the state variable has been set to S_XOSC_STARTING.
Roman
David Moss wrote:
Roman,
Thanks for sending us this code. Two things:
1. I independently had a problem with sending data from serial->radio
and back quickly (packets were being lost constantly, and everything
was going very, very slowly, to the point of failure; I wasn't able to
lock up any nodes, though). An update to CC2420AckLplP fixed the
problem (attached, also on contribs). This CC2420 radio stack has been
integrated into my deployment applications and tested with no issues
so far.
2. I haven't had a chance to test this update to see if it affects your
code. If the update doesn't work, how does decreasing the LPL receive
check interval affect your nodes? 100 ms Rx checks might be pretty fast
if you have a bunch of other things running, as well as accessing the
UART. Each receive check lasts for approximately 1 to 5 ms and is
completely atomic-blocked off in order to shut the radio off as fast as
possible.
Thanks again for finding and bringing this issue up. It's good to know
what issues might still be open in the upcoming 2.0.1 release.
-David
-----Original Message-----
From: Roman Lim [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 05, 2007 2:08 PM
To: David Moss
Cc: 'Philip Levis'; 'Jonathan Hui'; [EMAIL PROTECTED]
Subject: Re: [Tinyos-help] tinyos2 cc2420lpl locked-up node
I'm using current CVS and also David's new stack (as mentioned).
I wrote a simple test program (attached; DMA usage and
low-power listening were defined in the makefile) that reproduces this
behaviour.
In my testbed (consisting of 13 nodes), this code makes them hang.
Roman
David Moss wrote:
This does sound like a different issue than what we've seen before.
What's the best method to try to duplicate it? Do you simply have
a node idly duty cycling at 100 ms? Are there other transmitters
nearby? Is the node that locks up trying to transmit?
It's also surprising that the DMA version crashes too. I'll do what I
can to look into this.
-David
-----Original Message-----
From: Roman Lim [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 05, 2007 3:10 AM
To: David Moss
Cc: 'Philip Levis'; 'Jonathan Hui'
Subject: Re: [Tinyos-help] tinyos2 cc2420lpl locked-up node
Hi
I have a follow-up on this issue. I've been using this CC2420 stack
extensively (together with CTP) for the last few months.
I also use a component that communicates over UART0, so I used
resource arbitration.
I noticed that nodes (Tmote Sky) hang again in a loop (the same
while loop in SpiByte.write) after a few minutes, even if I use the
SPI implementation with DMA.
(Tested with the latest CC2420 stack in tinyos-2.x-contrib / rincon
/ tos / chips / cc2420, SleepInterval set to 100 ms.)
The problem seems to be a bit different from the last one:
The resource is now properly granted at the time the loop occurs
(at least this is what the state variables indicate).
I think the problem occurs somewhere in the duty-cycling radio
code, as the state of CC2420ControlP is S_XOSC_STARTING.
m_state in CC2420ReceiveP and CC2420TransmitP are both S_STOPPED.
Roman
David Moss wrote:
Done deal. Changed them to StdControl.
-David
-----Original Message-----
From: Philip Levis [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 27, 2007 10:04 AM
To: David Moss
Cc: 'Jonathan Hui'; [EMAIL PROTECTED]
Subject: Re: [Tinyos-help] tinyos2 cc2420lpl locked-up node
On Mar 27, 2007, at 9:49 AM, David Moss wrote:
True, but there currently are no interrupts that would turn off
the radio. The AsyncStdControl commands are only called from
SplitControl, which is not asynchronous. Should we handle the case
in the future where an interrupt might disable the radio? What would
be the plan of attack?
Then it could be StdControl instead of AsyncStdControl, and
everything is a lot easier.
If it's ASC, then I think it's a good idea to make sure the
implementation of the function is correct given its signature:
otherwise future users who think it's OK to call stop() in an
interrupt will be burned and have to spend weeks figuring out
what's going on.
So I think the two options are to either say that interrupts
can't disable the radio (StdControl), or say they can
(AsyncStdControl) and implement it safely. The former is a lot
easier...
Phil
_______________________________________________
Tinyos-help mailing list
[EMAIL PROTECTED]
https://mail.millennium.berkeley.edu/cgi-bin/mailman/listinfo/tinyos-help