We have a raft of intermittent oranges where we get "exit code -11".

I'm trying to land a patch that causes a largely unrelated set of tests
(identity) to hit the infamous exit code -11 problem.  There's an
existing intermittent filed against that test; it had disappeared for
months and only recently started popping up again.  Note that this
failure *never* results in a crashdump, and I've never seen it locally,
only in automation.
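
For anyone unsure what the number means: the sketch below assumes the
harness reports the child's status the way Python's subprocess module
does, where a negative return code -N means the child was terminated by
signal N.  Under that assumption -11 is signal 11, which is SIGSEGV on
Linux.  The names describe_exit and run_crashing_child are hypothetical
helpers for illustration, not anything in the tree.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Describe a return code using the subprocess convention.

    Assumption: a negative value -N means "terminated by signal N",
    as documented for subprocess.Popen.returncode on POSIX.
    """
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


def run_crashing_child():
    """Hypothetical demo: spawn a child that sends itself SIGSEGV."""
    code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    return subprocess.run([sys.executable, "-c", code]).returncode


if __name__ == "__main__":
    print(describe_exit(run_crashing_child()))
```

This only decodes the number; it says nothing about why no crashdump
is produced.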

My changes apparently tweak the timings just a little, and whichever
test runs last in the identity suite permafails, but only on linux64
opt non-e10s (and at lower frequency on linux32 opt non-e10s).  Adding
a patch that changes the order of some releases and what waits on what
takes it from permafail to a 10-25% failure rate.

However, this error is not confined to the code I'm playing with.
Currently a simple search yields roughly 75 open intermittents that
mention exit code -11: http://mzl.la/1TGByje

We need to either get a handle on these, or decide we don't care.  These
have been widespread in the tree for a year now.

Many more have been closed as WorksForMe, since the test that gets
'tagged' for the failure seems to have little to do with the cause.
Either it's something about channel/etc. shutdown (as mine appears to
be), or it's a landmine, or it's just timing.  Often it will tag one
test for a while, then disappear or move to another test and perhaps
never come back to the first one; eventually people close that bug as
WFM.

We *need* to find some solution to it -- even if it's to decide it's a
(safe) artifact of some underlying problem outside of our control.  I'd
far rather find a true cause and either fix or wallpaper it.  But right
now it's stopping me from landing some important code changes.

On the plus side, I have a nice Try run that triggers it 100% of the
time.  When I tried to provoke it on a loaner test VM, after painfully
emulating what's needed to run the tests, it wouldn't fail, but I don't
trust that my setup was a faithful recreation of a real Try run.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2eb01359621

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
_______________________________________________
dev-platform mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-platform