On 12 Nov 2012, at 15:44, Dave Pigott <[email protected]> wrote:

> 
> On 12 Nov 2012, at 15:33, Andy Doan <[email protected]> wrote:
> 
>> On 11/10/2012 03:17 AM, Dave Pigott wrote:
>>> Just the one…
>>> 
>>> ------------
>>> panda03
>>> ------------
>>> http://validation.linaro.org/lava-server/scheduler/job/38289
>>> 
>>> Looks like the board locked up just after the startup animation completed.
>>> Went onto the board, and it was indeed locked. Hard reset and it came back.
>>> Put it down to a one-off glitch.
>> 
>> Thanks for looking into this. With the new "newline" code the failure 
>> pattern looked different and I wasn't sure what went wrong.
>> 
>> I think we've had this type of failure occur 3 times in the past week on
>> Panda. I think it's becoming our #2 failure reason (more days needed to
>> really be sure).
> 
> I suspect you're right, and it underlines what Michael said the other day: we
> should define a period (week/month) over which we collect stats, give the
> biggest problem in that period the attention, and then iterate until the
> failures are negligibly small and we move to the next highest sample unit
> (month/quarter -> quarter/year). The data set is now small enough that it might
> warrant a month cycle, but I'm happy to review each week and see where we are.
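
The per-period triage quoted above could be sketched roughly as follows. This
is only an illustration, not anything LAVA provides: the function name, the
(date, reason) input shape and the grouping by ISO week are all assumptions
made for the example.

```python
from collections import Counter
from datetime import date


def top_failure_by_week(failures):
    """Return the most frequent failure reason for each ISO week.

    `failures` is an iterable of (date, reason) pairs (a hypothetical input
    shape). The result maps (iso_year, iso_week) -> (reason, count).
    """
    per_week = {}
    for day, reason in failures:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        per_week.setdefault(key, Counter())[reason] += 1
    # The top reason in each week is the one that gets the attention.
    return {week: counts.most_common(1)[0] for week, counts in per_week.items()}
```

Switching to a month or quarter cycle would only mean changing how `key` is
computed, e.g. `(day.year, day.month)`.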

Thoughts (conjecture):

Our boot is not clean in two ways:
(1) We do a soft reboot
(2) We use the master image u-boot

Both of these could, in theory, contribute to the image locking up.

We can fix (1) easily enough, but until we have a reliable sd-mux solution
and/or switch to booting from a USB thumb drive, there's not much we can do
about (2). It might be worth running a soak test on staging with a fix for (1)
to see whether the lock-ups become less frequent.

Thanks

Dave


_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
