Hi experts,

A while back I tried to help Kelvin debug a Roach1 that had been working at the factory and at the customer's site but then went wonky. I wasn't able to help him much; I bet someone out there can. Can you please help me make sense of what I was seeing? Suggestions for additional tests to run are welcome.
My test notes are in the .txt attachment. I'm not sure exactly what my questions are, but they are along these lines:

What does it mean to reset the Roach but not the Actel?

What are the differences in the way roach_monitor.py and reset_roach.sh perform a reset? Is it just syntax, i.e. two different ways of doing exactly the same thing, or is there an actual functional difference?

Should a new test script and/or roach_monitor program be started if the Roach1 is manually power cycled (to prevent software or hardware from getting into a confused state)?

Can the roach_monitor and the low-level board test scripts confuse each other, or the Roach1 hardware, if run at the same time?

Or maybe the behavior I was seeing was due to a Roach1 with a funky Xport or Actel or some other part?

Thanks,
Matt
Here's a summary of test results for a board that had previously passed the complete test suite, was shipped to a customer and used for a while, but then started to misbehave.

A) Connect PSU, power up, and use hyperterminal and a serial port to enter uboot manually. Then interactively program the kernel uImage and file system romfs to Flash.
   Boot via soloboot works AOK.

B) Terminate hyperterminal to free up the serial port so that a Roach test machine and software can exercise the board.
   https://casper.berkeley.edu/wiki/ROACH_test_machine
   https://casper.berkeley.edu/wiki/ROACH_Bringup
   Starting from the board already powered up and at the soloboot unix login prompt, run the test suite's borph_load_tftp.sh test script:
      9) Load Linux (BORPH) kernel (uImage) and root filesystem (romfs, mini-root) to Flash
      Attempting to load the Linux kernel and root filesystem to flash via TFTP transfer...
      Resetting ROACH...done.
      Powering-up ROACH...done.
      error: UBOOT did not boot.
   Kill the test script and fire up hyperterminal: don't see anything, and don't get any response to the enter key.
   The PPC network is down but the Xport network is up.
   The Xport was used to power down and power back up and it worked OK; hyperterminal was used to stop the automatic boot process.

C) Terminate hyperterminal to free up the serial port.
   Starting from the board already powered up and sitting at the uboot prompt, run the test suite's borph_load_tftp.sh test script:
      9) Load Linux (BORPH) kernel (uImage) and root filesystem (romfs, mini-root) to Flash
   Works AOK.

D) Immediately after test "C", rerun the test suite's borph_load_tftp.sh test script:
      9) Load Linux (BORPH) kernel (uImage) and root filesystem (romfs, mini-root) to Flash
   Works AOK.

E) Complete power down, then turn the PSU to the "on" position; only the Xport is on.
   Run the test suite's borph_load_tftp.sh test script:
      9) Load Linux (BORPH) kernel (uImage) and root filesystem (romfs, mini-root) to Flash
      Attempting to load the Linux kernel and root filesystem to flash via TFTP transfer...
      Resetting ROACH...done.
      Powering-up ROACH...done.
      error: UBOOT did not boot.

F) Stop the test script and start up a hyperterminal session.
   Press the Roach1's reset button.
   Roach1 starts up as expected.

G) Quit hyperterminal.
   Run the test suite's borph_load_tftp.sh test script:
      9) Load Linux (BORPH) kernel (uImage) and root filesystem (romfs, mini-root) to Flash
      Attempting to load the Linux kernel and root filesystem to flash via TFTP transfer...
      Resetting ROACH...done.
      Powering-up ROACH...done.
      error: UBOOT did not boot.

H) Enable the auto "power-on after hard-reset" setting via the RoachMonitor.
   Didn't help: borph_load_tftp.sh failed.
   Then disable the "power-on after hard-reset" mode (return to default mode).
   Didn't help: borph_load_tftp.sh failed.

I) Use roach_monitor to do a warm reset to "reset the Roach but not the Actel".
   The hyperterminal session still shows nothing. Quit hyperterminal.
   Try tftp step 9 again: borph_load_tftp.sh passes.

J) I then powered down via the PSU switch and left it off for many seconds.
   Upon powerup roach_monitor would hang. Multiple attempts all hang.
   Removing and reconnecting the RJ45 network cable didn't help.
   The Lantronix device installer found the Xport at the expected address, and borph_load_tftp.sh passed.

K) While staying in the same test script session, manually power cycle.
   borph_load_tftp.sh failed in the usual way.
   Try test script's "reset" 12.
   borph_load_tftp.sh failed in the usual way.

L) Kill the test script and start a new session and a new hyperterminal session.
   Try test script's "reset" 12.
   Serial port is alive and booted soloboot AOK.
   Kill hyperterminal.
   borph_load_tftp.sh passed.

M) Power down manually; kill test script. Neither hyperterminal nor roach_monitor running.
   Power up and run a new version of the test script.
   borph_load_tftp.sh passed.

N) Without starting a new test script, I manually powered down and then had repeated failures trying to power up via the roach_monitor:
      Resetting ROACH...not successful...ROACH power state: 0000000
      Powering-up ROACH...not successful...ROACH power state: 0000000
   Trying to reset the Roach1 but not the Actel failed in the same way.
   Stopping roach_monitor and starting a new version didn't help.

O) Stop the test script and roach_monitor and start a new test script. Now the test script can power up the board, but borph_load_tftp.sh failed in the usual way.

And then I ran out of test time.
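In case it helps anyone spot a pattern, the borph_load_tftp.sh outcomes from the notes above can be tabulated with a few lines of Python. This is only a rough sketch: the RUNS list and the tally_outcomes helper are made-up names for illustration, and the "state" descriptions are my own paraphrase of the steps, not output from the test suite or roach_monitor.

```python
from collections import Counter

# (step, board/software state before the run, borph_load_tftp.sh outcome)
# Transcribed by hand from steps B through O; wording of the states is paraphrased.
RUNS = [
    ("B", "board at soloboot login prompt", "fail"),
    ("C", "board at uboot prompt", "pass"),
    ("D", "immediately after C", "pass"),
    ("E", "PSU on, only Xport on", "fail"),
    ("G", "after reset button press, hyperterminal closed", "fail"),
    ("H", "power-on after hard-reset enabled", "fail"),
    ("H", "power-on after hard-reset disabled", "fail"),
    ("I", "after roach_monitor warm reset", "pass"),
    ("J", "after PSU power cycle, roach_monitor hanging", "pass"),
    ("K", "same script session, manual power cycle", "fail"),
    ("K", "same script session, script reset 12", "fail"),
    ("L", "new script session, script reset 12", "pass"),
    ("M", "new script, no hyperterminal or roach_monitor", "pass"),
    ("O", "new script session after N", "fail"),
]


def tally_outcomes(runs):
    """Count how many recorded runs passed and how many failed."""
    return Counter(outcome for _step, _state, outcome in runs)


if __name__ == "__main__":
    counts = tally_outcomes(RUNS)
    print("pass:", counts["pass"], "fail:", counts["fail"])
    for step, state, outcome in RUNS:
        print(f"{step}) {outcome:4s}  {state}")
```

Steps A, F and N are left out because they did not include a borph_load_tftp.sh run.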

