So the big question is where is that 127 coming from...

As Ayan pointed out, threaded_memtest can only return 0 or 1 explicitly,
via the rv variable.
However, check out this bit of run_processes() in memory_test, which is
only called when running in multi-thread mode:

if line and len(line) > 1:
    print "process %u pid %u: %s" % (i, pipe[i].pid, line)
    sys.stdout.flush()
if pipe[i].poll() == -1:
    waiting = True
else:
    return_value = pipe[i].poll()
    if return_value != 0:
        print "Error: process  %u pid %u retuned %u" % (i, pipe[i].pid, return_value)
        passed = False
    print "process %u pid %u returned success" % (i, pipe[i].pid)
    pipe[i] = None

First, a minor detail... that second print statement is incorrect.  It is
not inside an else branch, so it will ALWAYS be printed, regardless of
outcome, which is why the logs show:
Error: process 1 pid 2188 retuned 127
process 1 pid 2188 returned success
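
To make that concrete, here is a minimal sketch of how the reporting
could be split so that only one of the two messages is produced.  The
report_exit() helper is hypothetical (it is not in memory_test), and it
is written so that it also runs under Python 3:

```python
def report_exit(index, pid, return_value):
    """Return (passed, message) for one finished child process.

    Hypothetical helper, not part of memory_test: it only shows the
    success message being confined to the zero-exit branch.
    """
    if return_value != 0:
        message = "Error: process %u pid %u returned %u" % (
            index, pid, return_value)
        return False, message
    return True, "process %u pid %u returned success" % (index, pid)


# One message per process, never both.
for status in (127, 0):
    passed, message = report_exit(1, 2188, status)
    print(message)
```

In the script itself the same effect could probably be had by putting
the success print under an else.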

More importantly, on these failing systems the pipe[i].poll() call is
returning the 127.  So this has to be generated somewhere between the
memory_test script and threaded_memtest.  memory_test only keeps waiting
as long as poll returns -1; anything else is treated as the final exit
status.
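
For anyone who wants to poke at the polling behaviour in isolation, here
is a small sketch.  It assumes the standard subprocess module rather
than whatever memory_test actually uses to create its pipes; with
subprocess.Popen, poll() returns None (not -1) while the child is still
running.  The wait_for_exit() name and the child command are made up for
the example:

```python
import subprocess
import sys
import time


def wait_for_exit(cmd, interval=0.05):
    """Poll a child process until it finishes and return its exit status.

    Assumes subprocess.Popen, where poll() returns None while the child
    is still running and the exit status once it has finished.
    """
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:
        time.sleep(interval)
    return proc.returncode


# Illustrative child that simply exits with status 127.
code = wait_for_exit([sys.executable, "-c", "import sys; sys.exit(127)"])
print("child exited with %d" % code)
```

This only shows that poll() hands back whatever status the child exited
with; it says nothing about where the 127 in this bug comes from.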

We also know that threaded_memtest can only return 0 or 1.

So, going on that, perhaps we should change the wait statement to this:

state = pipe[i].poll()
if state not in (0, 1):
    waiting = True
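
One thing to watch with a check like this: written as
"state != 0 or state != 1" it is true for every value, including 0 and
1, so the loop would never stop waiting.  The sketch below compares the
two forms; still_waiting() and TERMINAL_STATES are names invented for
the example, on the assumption that 0 and 1 are the only statuses
threaded_memtest can really exit with:

```python
# Assumption taken from the discussion: threaded_memtest only ever
# exits with 0 or 1, so any other value means "not finished yet".
TERMINAL_STATES = (0, 1)


def still_waiting(state):
    """Return True while state is not one of the known exit statuses."""
    return state not in TERMINAL_STATES


for state in (-1, 0, 1, 127):
    always_true = (state != 0 or state != 1)
    print("%4d  membership test: %-5s  'or' form: %s" % (
        state, still_waiting(state), always_true))
```

The membership test stops waiting on 0 and 1 only, while the "or" form
reports waiting in every case.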

I wonder if that 127 is the kernel saying that the process is closing,
but isn't quite done yet (perhaps the kernel is blocking the process
exit while it recovers resources?)

I'm attaching a branch of checkbox to this (for oneiric) that
incorporates this... Can someone test this on a failing system and see
if it resolves the issue?

Incidentally, from looking at memory_test, I don't think this would
happen when running in a single-thread situation.  I think this will
only affect multi-threaded testing.

Finally, while this seems to be pretty frequent on the Latitude 2120, I
don't think this is machine specific either.  Perhaps this one is "just
slow enough" that we're seeing this... I don't know, but my gut feeling
is that the 2120 just happens to be the right combination of factors to
highlight what's actually a flaw in the test itself.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/823419

Title:
  [Latitude 2120] Memory test fails on Alpha 3
