Re: [Linaro-validation] Health check failures

Dave Pigott Tue, 16 Oct 2012 10:51:14 -0700

On 16 Oct 2012, at 17:22, Alexander Sack <[email protected]> wrote:

> +anmar
> 
> On Tue, Oct 16, 2012 at 5:59 PM, Andy Doan <[email protected]> wrote:
>> On 10/16/2012 02:26 AM, Lee Jones wrote:
>>> 
>>> On Mon, 15 Oct 2012, Andy Doan wrote:
>>> 
>>>> On 10/15/2012 01:04 PM, Alexander Sack wrote:
>>>>>>>>> 
>>>>>>>>> --------------------
>>>>>>>>> snowball06/08
>>>>>>>>> --------------------
>>>>>>>>> http://192.168.1.10/lava-server/scheduler/job/35179
>>>>>>>>> 
>>>>>>>>> eth0 failed to come up. We see this a lot with snowballs.
>>>>>>> 
>>>>>>> 
>>>>>>> "We see this a lot" -- do we have actual numbers?  To everyone:
>>>>>>> assuming not, what can we do to get some?
>>>> 
>>>> 
>>>> I keep the log of health check failures at:
>>>> 
>>>> 
>>>> 
>>>> https://docs.google.com/a/linaro.org/spreadsheet/ccc?key=0AnxpY5uv-BlNdG9zYTdDLWZWRVFGaWFxQzRLNWtaNmc#gid=8
>>>> 
>>>> In the past 5 days it's happened 4 times on snowball.
>>>> 
>>>> Prior to that, in a span of 25 health failures, snowball accounted
>>>> for 8 of the failures. Half of those failures look like this
>>>> problem. So this snowball issue is accounting for around 16% of our
>>>> health check failures.
>>> 
>>> 
>>> So it works sometimes, but not others? Sounds like a h/w bug.
> 
> could be a hw bug, but driver bugs can also give nondeterministic
> behaviour in full system stacks, in my experience (racy things
> etc.). Since we are in the software business I feel we should look closer
> at the software side before dismissing something as a hw bug ...
> 
> How can we nail the source of this? Maybe we have a kernel that we
> have a gut feeling is better than the 12.02 one and could give that a
> stress test try?


Idea for a plan: we take snowball06, run loop tests on each of 12.{03-09} for
a few days, and see whether any one behaves better than the others?
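
A rough sketch of how the loop-test results could be compared afterwards,
assuming each run is recorded as a (release, passed) pair. The function name
and the record format are made up for illustration; nothing here is an
existing LAVA interface.

```python
from collections import Counter


def failure_rates(results):
    """Return {release: fraction of failed runs}.

    `results` is an iterable of (release, passed) pairs, one per
    loop-test run. This record format is an assumption, not
    something LAVA produces as-is.
    """
    runs = Counter()
    failures = Counter()
    for release, passed in results:
        runs[release] += 1
        if not passed:
            failures[release] += 1
    # Every release in `runs` has at least one run, so no division by zero.
    return {release: failures[release] / runs[release] for release in runs}
```

The same kind of ratio is behind the 16% figure quoted above: half of
snowball's 8 failures is 4, and 4 out of 25 health failures is 0.16.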

Dave
_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
