Re: Project Stockwell - February 2017 update

2017-02-15 Thread Joel Maher
I wonder if we could make a single link in orangefactor that would give you
the range of TEST-UNEXPECTED-FAIL messages to help with this.  I filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1339937 to track this. Please
do offer suggestions and use cases in that bug.

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Project Stockwell - February 2017 update

2017-02-14 Thread L. David Baron
On Tuesday 2017-02-07 21:33 -0800, Bill McCloskey wrote:
> I spent about an hour tonight trying to debug a test failure, and I'm
> writing this email in frustration at how difficult it is. It seems like the
> process has actually gotten a lot worse over the last few years (although
> it was never good). Here's the situation I ran into:

Another aspect of debugging test failures that has gotten worse
recently:

 * If you have an intermittent that's actually affecting the tree,
   it's become harder to see the range of TEST-UNEXPECTED-FAIL
   messages that are occurring.  These used to be present in the
   comments that tbplbot made on bugs, but now it requires following
   a link for each log in the orangefactor interface.  (Having this
   range was useful to me today in fixing 1159532, although clicking
   through to 6 logs was sufficient to help understand the problem.)

   This also makes it much harder to tell if bugs are being
   mis-classified (e.g., two different problems being starred into
   one bug).

(I thought the point of structured logging was to make it easier to
get this sort of data.)
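
A minimal sketch of the kind of summary described above, assuming the logs
have already been fetched as plain text and that failure lines follow a
"TEST-UNEXPECTED-FAIL | test | message" layout. The function name, log names
and test paths are hypothetical, and this is not how orangefactor or tbplbot
work; it only shows how grouping the messages across logs can expose two
different problems starred into one bug.

```python
import re
from collections import defaultdict

# Assumed layout: "... TEST-UNEXPECTED-FAIL | <test> | <message>".
FAIL_RE = re.compile(
    r"TEST-UNEXPECTED-FAIL\s*\|\s*(?P<test>[^|]+?)\s*\|\s*(?P<msg>.*)$"
)


def collect_failures(logs):
    """Map each distinct (test, message) pair to the set of logs it occurs in.

    `logs` is a dict of {log_name: log_text} supplied by the caller.
    """
    seen = defaultdict(set)
    for name, text in logs.items():
        for line in text.splitlines():
            match = FAIL_RE.search(line)
            if match:
                key = (match.group("test"), match.group("msg").strip())
                seen[key].add(name)
    return dict(seen)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "log1": "INFO start\nTEST-UNEXPECTED-FAIL | dom/test_a.html | timed out\n",
        "log2": "TEST-UNEXPECTED-FAIL | dom/test_a.html | got 1, expected 2\n",
    }
    for (test, msg), names in sorted(collect_failures(sample).items()):
        print(f"{len(names)} log(s): {test} | {msg}")
```

If one bug's logs produce more than one distinct message for the same test,
that is a hint that the bug may be collecting unrelated failures.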

-David

-- 
𝄞   L. David Baron http://dbaron.org/   𝄂
𝄢   Mozilla  https://www.mozilla.org/   𝄂
 Before I built a wall I'd ask to know
 What I was walling in or walling out,
 And to whom I was like to give offense.
   - Robert Frost, Mending Wall (1914)




Re: Project Stockwell - February 2017 update

2017-02-14 Thread Andrew Halberstadt
Just noticed no one looped back here. Joel filed bug 1337844 and bug 1337839.
There has been some discussion there. To summarize, running tests locally is
currently optimized towards "Run all tests related to code in " instead of
"Run all tests in ". Optimizing for one by default comes with a trade-off for
the other.

That being said, I think there is some low hanging fruit that could make
the overall situation better, namely:

* Ability to pass manifests as arguments, e.g. ./mach mochitest
  path/to/manifest.ini (this would bypass subdirs)
* An overall summary (bug 1209463)
* A mode to prevent multiple Firefox instances from running (we would error
  out if e.g. multiple dirs or subsuites would be run)
* Advertising/bootstrapping aliases for common configurations, e.g. add the
  following to ~/.mozbuild/machrc:

    [alias]
    mochitest-media = mochitest -f plain --subsuite media

I agree that this kind of stuff is important, though can't make promises on
a timeline.
-Andrew



Re: Project Stockwell - February 2017 update

2017-02-10 Thread Mike Conley
There's good feedback in here. Are some of these known, jmaher? Are any
intentional choices, or should we just start turning these into bugs to
get fixed?



Re: Project Stockwell - February 2017 update

2017-02-07 Thread Bill McCloskey
Hi Joel,
I spent about an hour tonight trying to debug a test failure, and I'm
writing this email in frustration at how difficult it is. It seems like the
process has actually gotten a lot worse over the last few years (although
it was never good). Here's the situation I ran into:

A test is failing on try. I want to reproduce it. Assume that running the
test by itself isn't sufficient. I would like to run whatever set of tests
actually ran together on the test machine in a single Firefox invocation. I
don't want any more tests to run than those. I can't figure out any way to
do that.

I can pass a directory to |mach mochitest|. But that has a number of
problems:
- It also runs every subdirectory recursively inside that directory. Often
that includes way more tests. I can't figure out any way to stop it from
doing this. I tried the "--chunk-by-dir" option, but it complains that the
argument is supposed to be an integer. What is this option for?
- |mach mochitest| runs all flavors of tests even though I only want one.
There is the --flavor option to disable that. But I have never figured out
how to use it correctly. No matter what I do, some irrelevant devtools or
a11y or plugin tests seem to run instead of what I want.
- There is a --start-at option that should allow me to start running tests
near the one that I want. But it never seems to work either. I'm not sure
if it's confounded by the two problems above, or if it's just completely
broken.

We could easily fix this by printing in the tinderbox log the mach command
that you need to run in order to run the tests for a particular directory
(and making that discoverable through treeherder).

I want to emphasize that, from a developer's perspective, this is the
second most basic thing that I could ask for. (Simply running a test by
itself is the most basic, and it works fine.) Running tests by directory in
automation has been a huge improvement, but we're not really earning the
dividends from it because it's so hard to get the same behavior locally.

Anyway, sorry about the rant. And sorry to pick on your email. But it's
frustrating to see all these advanced ideas being proposed when we can't
even get basic stuff right.

As an aside, I would also like to complain about the decision to strip a
lot of the useful information out of test logs. I understand this was done
to make the tests faster, and I can "just" check in a patch to add
"SimpleTest.requestCompleteLog()" to the intermittent test. But why didn't
we instead figure out why logging was so slow and fix that? Fundamentally,
it doesn't seem like saving 50MB of log data to S3 should take very long.

-Bill



Project Stockwell - February 2017 update

2017-02-07 Thread jmaher
This is the second update of project stockwell (first update: 
https://goo.gl/1X31t8).

This month we will be recommending and asking that intermittent failures that 
occur >=30 times/week be resolved within 2 weeks of triaging them.

Yesterday we had these stats:
Orange Factor: 10.75 (https://goo.gl/qvFbeB)
count(high_frequency_bugs): 61

Last month we had these stats:
Orange Factor: 13.76 (https://goo.gl/o5XOof)
count(high_frequency_bugs): 42
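
As a rough reading of the two sets of numbers above, the month-over-month
change can be worked out as below. The helper name is made up for this
sketch; the inputs are the four figures quoted above.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


# Figures from the two sets of stats above (last month -> yesterday).
orange_factor_change = percent_change(13.76, 10.75)
high_frequency_change = percent_change(42, 61)

print(f"Orange Factor: {orange_factor_change:+.0f}%")
print(f"count(high_frequency_bugs): {high_frequency_change:+.0f}%")
```

So the Orange Factor fell by roughly 22% while the number of high-frequency
bugs rose by roughly 45%.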

For more details of the bugs and what we are working on, you can read more on 
this recent blog post:
https://elvis314.wordpress.com/2017/02/07/project-stockwell-february-2017/

Thanks for helping out with intermittent test failures when we ping you about 
them!