Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread James Graham

On 15/09/17 18:45, Dan Mosedale wrote:

> I wonder if this isn't (in large part) a design problem disguised as a
> behavior problem.  The existing try syntax (even with try chooser) is so
> finicky and filled with abbreviations that even after years of working with
> it, I still regularly have to look up stuff and sometimes when I've been in
> a hurry, I've done something more general than I really needed because it
> was just too painful to figure out the exact thing.
>
> I'd be pretty surprised if developers newer to the mozilla infrastructure
> than I didn't end up doing this sort of thing substantially more frequently.
>
> https://ahal.ca/blog/2017/mach-try-fuzzy/ seems like a fine step in the
> right direction, and maybe that'll be enough.
>
> But I do wonder if the path to saving substantial time and money in the
> long run is to invest some real user-research / UX / design time into
> designing a try configurator where it requires effort to do the
> unnecessarily expensive thing, as opposed to the current situation, where
> it requires effort to avoid the expensive thing.


I think that's a rather uncontroversial opinion. Historically we have
been hampered by the fact that the set of try jobs was basically unknown
and constantly changing, and the code was scattered across many
repositories. Now that Taskcluster defines everything in a single place
and the majority of the code is in-tree, it will be much easier to
experiment with different frontends that make it easy to select the
right jobs. That's what allowed ahal to write |mach try fuzzy|.
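
To make the idea of a selection frontend concrete, here is a minimal sketch of fuzzy filtering over a list of task labels. It is only an illustration: the labels are made up, the helper names are hypothetical, and this is not how |mach try fuzzy| is actually implemented.

```python
# Illustrative only: hypothetical task labels, not real taskgraph output.
TASK_LABELS = [
    "build-linux64/opt",
    "build-win64/debug",
    "test-linux64/opt-mochitest-e10s-2",
    "test-linux64/debug-mochitest-browser-chrome-e10s-1",
    "test-macosx64/opt-xpcshell-1",
]


def is_subsequence(query, label):
    """Return True if the characters of query appear in label, in order."""
    remaining = iter(label.lower())
    # `ch in remaining` consumes the iterator up to the first match,
    # so later characters must be found after earlier ones.
    return all(ch in remaining for ch in query.lower())


def fuzzy_select(query, labels):
    """Keep the labels that match every whitespace-separated query term."""
    terms = query.split()
    return [
        label
        for label in labels
        if all(is_subsequence(term, label) for term in terms)
    ]


if __name__ == "__main__":
    for label in fuzzy_select("linux mochi", TASK_LABELS):
        print(label)
```

The point is only that once every task has a label in one place, narrowing a push down to the relevant jobs becomes a simple filtering problem.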


There is also a desire to have better change-based job selection, so 
that the default behaviour can be "run the jobs that are most likely to 
be affected by the change I just made".
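
A very rough sketch of what change-based selection could look like, assuming a hand-written mapping from source directories to suites. The mapping, the fallback value, and the function name are all hypothetical; a real implementation would presumably derive this from in-tree metadata rather than a hard-coded table.

```python
# Hypothetical mapping from source directory prefixes to relevant suites.
SUITES_BY_PREFIX = {
    "browser/": {"mochitest-browser-chrome"},
    "dom/": {"mochitest", "web-platform-tests"},
    "js/src/": {"jittest", "jsreftest"},
}

# If we cannot tell what a file affects, fall back to running everything.
FALLBACK = {"all"}


def suites_for_change(changed_files):
    """Return the set of suites most likely affected by the changed files."""
    selected = set()
    for path in changed_files:
        matches = [
            suites
            for prefix, suites in SUITES_BY_PREFIX.items()
            if path.startswith(prefix)
        ]
        if not matches:
            # Unknown territory: be conservative rather than miss a failure.
            return set(FALLBACK)
        for suites in matches:
            selected |= suites
    return selected


if __name__ == "__main__":
    print(sorted(suites_for_change(["dom/base/nsDocument.cpp"])))
```

The conservative fallback matters: a selector that silently skips tests for unknown paths would trade backlog for missed regressions.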


However, all of these improvements will take time, and in the meantime
the excessive backlog on try is causing problems, so some changes in
user behaviour will be helpful as we work toward better tools.

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread Dan Mosedale
I wonder if this isn't (in large part) a design problem disguised as a
behavior problem.  The existing try syntax (even with try chooser) is so
finicky and filled with abbreviations that even after years of working with
it, I still regularly have to look up stuff and sometimes when I've been in
a hurry, I've done something more general than I really needed because it
was just too painful to figure out the exact thing.

I'd be pretty surprised if developers newer to the mozilla infrastructure
than I didn't end up doing this sort of thing substantially more frequently.

https://ahal.ca/blog/2017/mach-try-fuzzy/ seems like a fine step in the
right direction, and maybe that'll be enough.

But I do wonder if the path to saving substantial time and money in the
long run is to invest some real user-research / UX / design time into
designing a try configurator where it requires effort to do the
unnecessarily expensive thing, as opposed to the current situation, where
it requires effort to avoid the expensive thing.

Dan


2017-09-15 9:46 GMT-07:00 Geoffrey Brown :

> Masayuki, your try push had trouble because you requested
> "mochitest-2" instead of "mochitest-e10s-2". Non-e10s mochitests only
> run on Android and Windows now. You probably wanted something like:
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d68382f17d63f0674c62acc7242a9e406793895f
>
>
> This is a good example of how a small deviation from "correct" try
> syntax can have unexpected and frustrating consequences.
>
>  - Geoff

Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread Geoffrey Brown
Masayuki, your try push had trouble because you requested
"mochitest-2" instead of "mochitest-e10s-2". Non-e10s mochitests only
run on Android and Windows now. You probably wanted something like:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d68382f17d63f0674c62acc7242a9e406793895f


This is a good example of how a small deviation from "correct" try
syntax can have unexpected and frustrating consequences.

 - Geoff

On Thu, Sep 14, 2017 at 7:15 PM, Masayuki Nakano wrote:
> I was trying to make a different point. See the Treeherder log: mochitests
> didn't run except on Win7 Debug and Android 4.3 API16+ Opt/Debug. So the try
> syntax parser or something is really broken. I often hit this kind of bug.
>
> --
> Masayuki Nakano 
> Software Engineer, Mozilla
>
> --
> You received this message because you are subscribed to the Google Groups
> "firefox-ci" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to firefox-ci+unsubscr...@mozilla.com.
> To post to this group, send email to firefox...@mozilla.com.
> To view this discussion on the web visit
> https://groups.google.com/a/mozilla.com/d/msgid/firefox-ci/866a0e06-fbd9-c99b-451e-e20f80a12759%40mozilla.com.


Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread Masayuki Nakano
Even when I had the chunk numbers, specifying mochitest chunk numbers
didn't work; see this log:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c09c7046ed0664e89f7224e1de5219c39c94c948

After that, I needed to rerun mochitests with |-u mochitests|. IIRC, I
tried to kick off the specific chunks with "Add new jobs", but that
didn't work.

Also, when I try to investigate random oranges that are not
reproducible in my environment, I want options like
|--run-until-failure| and |--repeat REPEAT| in the try syntax. Because
there are no such options, I need to trigger a lot of jobs manually, and
that may cause too many oranges.


--
Masayuki Nakano 
Software Engineer, Mozilla


Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread Masayuki Nakano
I was trying to make a different point. See the Treeherder log: mochitests
didn't run except on Win7 Debug and Android 4.3 API16+ Opt/Debug. So the try
syntax parser or something is really broken. I often hit this kind of bug.


On 9/15/2017 10:07 AM, Kris Maglione wrote:
> Your best bet is probably to use `mach try` with a specific set of test
> directories. It will generate a set of --try-test-paths flags to
> restrict tests to those paths, and only run the first chunk of any
> group. Without that, groups shift around too much to be reliable.


--
Masayuki Nakano 
Software Engineer, Mozilla


Re: Reminder on Try usage and infrastructure resources

2017-09-15 Thread James Graham

On 15/09/17 00:53, Dustin Mitchell wrote:
> 2017-09-14 18:32 GMT-04:00 Botond Ballo:
>> I think "-p all" is still useful for "T pushes" (and it sounds like
>> build jobs aren't the main concern resource-wise).
>
> Correct -- all builds are in AWS.
>
> I'd like to steer this away from "What legacy syntax should we use
> instead?" and "How should we tweak the legacy try syntax?" to:
>
>   How can we use the modern tryselect functionality to achieve more
>   precise try pushes?

I think that's a good discussion to have, but the original motivation
for this thread, AIUI, was the recent incidents in which there have been
12+ hour backlogs on try, causing problems across the org. In general we
ought to solve this by being smarter about what's run automatically, but
we aren't there yet. We also don't have full uptake of |mach try fuzzy|,
and in any case the people likely to be impacted by this all know try
syntax. So a discussion in those terms seems meaningful.


I think there are some fairly simple rules people can apply to help with
the observed, recurring problem. These are not official, and I'm not in a
position of authority here, but I assume people will correct anything
that's wrong or controversial:


* -p all is generally OK because builds are on cloud machines and we 
aren't hardware constrained there. Obviously any unnecessary job, 
including builds, does cost money.


* Bare -p all -u all generally isn't OK. In particular it shouldn't be 
seen as the default "check before landing" try push. Of course, if you 
have a large cross-cutting change that genuinely could affect any test 
on any platform, it might be the right choice.


* A combination of selecting specific relevant suites and representative 
platforms using -u [platform] is generally a good choice. 
|mach try fuzzy| is a better way to schedule this kind of push.


* mach try allows specifying specific paths or directories. This allows
even finer-grained test selection when you are interested in specific
tests.


* In general running tests on mac should be avoided if possible. This is
our most hardware constrained regression test platform. It will help a
lot if people only use it when they think that their change will affect
mac differently to linux and windows.


* If you know your try push failed before all jobs complete, or you land 
a patch with jobs still pending, please take a moment to cancel all 
pending jobs from treeherder. That is disproportionately helpful for 
freeing up resources on backlogged platforms.


* I have no idea about performance tests.
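
A couple of the rules above are mechanical enough to check automatically. The following is a minimal sketch of a try-syntax "lint", assuming only the -b/-p/-u flags mentioned in this thread; the defaults, the function names, and the "macosx" platform prefix are assumptions made for the sketch, not the behaviour of the real try syntax parser.

```python
import argparse
import shlex


def parse_try_syntax(message):
    """Parse the flags that follow 'try:' in a commit message."""
    parser = argparse.ArgumentParser(add_help=False)
    # Defaults are placeholders for this sketch, not the real parser's.
    parser.add_argument("-b", dest="build", default="do")
    parser.add_argument("-p", dest="platforms", default="none")
    parser.add_argument("-u", dest="unittests", default="none")
    _, _, flags = message.partition("try:")
    # Anything we do not recognise (other flags, test paths) is ignored.
    args, _unknown = parser.parse_known_args(shlex.split(flags))
    return args


def lint_try_syntax(message):
    """Return a list of warnings based on the informal rules above."""
    args = parse_try_syntax(message)
    warnings = []
    if args.platforms == "all" and args.unittests == "all":
        warnings.append(
            "bare -p all -u all: pick specific suites and platforms if you can"
        )
    if "macosx" in args.platforms:
        warnings.append(
            "mac is hardware constrained: only use it for mac-specific changes"
        )
    return warnings


if __name__ == "__main__":
    for warning in lint_try_syntax("try: -b do -p all -u all"):
        print("warning:", warning)
```

Something like this could only ever nag, not decide; a genuinely cross-cutting change may still need the broad push.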


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Kris Maglione
Your best bet is probably to use `mach try` with a specific set 
of test directories. It will generate a set of --try-test-paths 
flags to restrict tests to those paths, and only run the first 
chunk of any group. Without that, groups shift around too much 
to be reliable.


On Fri, Sep 15, 2017 at 10:03:00AM +0900, Masayuki Nakano wrote:
> Even when I got the chunk numbers, specifying chunk numbers of
> mochitests wouldn't work, see this log:
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=c09c7046ed0664e89f7224e1de5219c39c94c948
> After that, I needed to rerun mochitests with |-u mochitests|. IIRC, I
> tried to kick the specific chunks with "Add new jobs", but didn't
> work.
> And also, when I try to investigate random oranges which are not
> reproducible on my environments, I want an option like
> |--run-until-failure| and |--repeat REPEAT| in the try syntax. Because
> of no such options, I need to trigger a lot of jobs manually and that
> may/might cause too many oranges.



--
Kris Maglione
Senior Firefox Add-ons Engineer
Mozilla Corporation

The presence of those seeking the truth is infinitely to be preferred
to the presence of those who think they’ve found it.
--Terry Pratchett



Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Dustin Mitchell
2017-09-14 18:32 GMT-04:00 Botond Ballo :
> I think "-p all" is still useful for "T pushes" (and it sounds like
> build jobs aren't the main concern resource-wise).

Correct -- all builds are in AWS.

I'd like to steer this away from "What legacy syntax should we use
instead?" and "How should we tweak the legacy try syntax?" to:

 How can we use the modern tryselect functionality to achieve more
precise try pushes?

(tryselect is the task-selection logic behind ./mach try fuzzy)

Dustin


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Botond Ballo
On Thu, Sep 14, 2017 at 4:54 PM, Mike Hommey  wrote:
> Maybe it's time to kill the `all` flag, at least for -p. Why? For the
> combined reason that you're saying we shouldn't be using it, and that
> it's actually *not* running every platform.

I think "-p all" is still useful for "T pushes" (and it sounds like
build jobs aren't the main concern resource-wise).

Cheers,
Botond


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Mike Hommey
On Thu, Sep 14, 2017 at 11:35:53AM -0400, Stuart Philp wrote:
> Hello all,
> 
> As we near 57 the Firefox CI group felt it was important to send out a bit
> of a reminder regarding infrastructure usage when you push.
> 
> *tl;dr* There is a real cost (both time and $) to using the 'all' flags in
> pushes. They are there if you need them, but please remember to think about
> what platforms and test suites you need to execute before you push, and
> limit the scope of execution if you can.

Maybe it's time to kill the `all` flag, at least for -p. Why? For the
combined reason that you're saying we shouldn't be using it, and that
it's actually *not* running every platform.

Mike


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Kyle Lahnakoski

You can try ActiveData, which stores all test results from the past few
weeks.  Here is an example query that shows the chunk number for each
run/build combo in the past day.  Note that ActiveData is sometimes more
than a day behind.

https://activedata.allizom.org/tools/query.html#query_id=4HHuBgDu

{
    "from":"unittest",
    "select":[
        {"aggregate":"count"},
        {"value":"action.start_time","aggregate":"max"}
    ],
    "groupby":[
        "run.suite",
        "run.chunk",
        "result.test",
        "build.platform",
        "build.type",
        "run.type"
    ],
    "where":{"and":[
        {"eq":{"build.branch":"mozilla-inbound"}},
        {"prefix":{"run.suite":"moch"}},
        {"gt":{"action.start_time":{"date":"today-day"}}},
        {"regex":{"result.test":".*browser_623779.js.*"}}
    ]},
    "limit":1000
}
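
If you want the same query for a different test, a small helper can build it. This is just a sketch: the function name and its parameters are hypothetical, the field names are copied from the query above, and it only builds and prints the JSON (sending it to ActiveData is left out because the HTTP endpoint is not given here). Unlike the query above, it escapes the test name, so "." matches only a literal dot; I am assuming ActiveData accepts the escaped form.

```python
import json
import re


def chunk_query(test_name, branch="mozilla-inbound", limit=1000):
    """Build a query listing suite/chunk/platform combos that ran test_name."""
    return {
        "from": "unittest",
        "select": [
            {"aggregate": "count"},
            {"value": "action.start_time", "aggregate": "max"},
        ],
        "groupby": [
            "run.suite",
            "run.chunk",
            "result.test",
            "build.platform",
            "build.type",
            "run.type",
        ],
        "where": {"and": [
            {"eq": {"build.branch": branch}},
            {"prefix": {"run.suite": "moch"}},
            {"gt": {"action.start_time": {"date": "today-day"}}},
            # Escape the name so regex metacharacters in it are literal.
            {"regex": {"result.test": ".*" + re.escape(test_name) + ".*"}},
        ]},
        "limit": limit,
    }


if __name__ == "__main__":
    print(json.dumps(chunk_query("browser_623779.js"), indent=4))
```

The printed JSON can be pasted into the query tool linked above.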



On 2017-09-14 11:49, Michael de Boer wrote:
>> On 14 Sep 2017, at 17:48, Marco Bonardo  wrote:
>>
>> When I need to retrigger a mochitest-browser test multiple times (to
>> investigate an intermittent), often I end up running all the
>> mochitest-browser tests, looking at every log until I find the chunk
>> where the test is, and retrigger just that chunk. The chunk number
>> changes based on the platform and debug/opt, so it's painful.
>> Is there a way to trigger only the chunk that will contain a given
>> test, so I can save running all of the other chunks?
> This! This! This! I’d love to be able to do this - it would make testing
> possible test failure fixes sooo much easier.
>
> Cheers,
>
> Mike.
>



Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Marco Bonardo
On Thu, Sep 14, 2017 at 6:13 PM, Andrew Halberstadt wrote:
> Yes, all mochitests except Android restart between manifests (which is
> usually the same as folders).

Thank you very much, that's really useful to know and will allow me to
save some infra time!


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Andrew Halberstadt
Yes, all mochitests except Android restart between manifests (which is
usually the same as folders).

On Thu, Sep 14, 2017 at 12:03 PM Marco Bonardo  wrote:

> On Thu, Sep 14, 2017 at 5:56 PM, James Graham 
> wrote:
> > On 14/09/17 16:48, Marco Bonardo wrote:
> > mach try -p linux64 
>
> Afaict, that runs a single folder, but the intermittent may be caused
> by interactions across different tests in different folders. I'm not
> up to date with what we do today: do we restart the test harness for
> each folder? That'd basically solve my troubles.


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Cameron Dawson
That’s correct, yeah.  If you don’t have a push where it has already failed,
then it won’t show in the Test-Centric UI.  I’ll file a bug to explore adding
this functionality, though.  Perhaps there’s a way to mine ActiveData to get
this.

-Cam

> On Sep 14, 2017, at 9:05 AM, Gijs Kruitbosch  wrote:
> 
> This only works once you have a run that failed the test you're interested 
> in, right? There's no way to tell the test-centric UI "find me the chunk for 
> test with name X".
> 
> ~ Gijs
> 
> On 14/09/2017 16:55, Cameron Dawson wrote:
>> Marco—  I don’t know of a way to do exactly that yet.  But that is in the 
>> roadmap for the Test-based UI in Treeherder.  And the existing UI may help 
>> you there.
>> 
>> On any push, click the down arrow (Action Menu) at the far right of the push 
>> status line and select “Experimental: Test-Centric UI”
>> From there you can see the list of tests that failed for that push (at this 
>> time, only for tests that log with the structured logging, but they include 
>> Mochitest)
>> For each test, you’ll see a link to the chunk back in Treeherder where that 
>> test ran.  So you can go BACK to Treeherder to do your retrigger there.  
>> This side-UI will be moving back into the main Treeherder repo soon, so 
>> you’ll be able to trigger directly from there at some point.
>> 
>> I realize this workflow is a bit cumbersome, but perhaps better than poring
>> through logs.  :)
>> 
>> I’m actively working on this UI, so please give me any feedback you have in 
>> the form of bugs or in #treeherder.
>> 
>> -Cam
>> 
>> 
>>> On Sep 14, 2017, at 8:48 AM, Marco Bonardo  wrote:
>>> 
>>> When I need to retrigger a mochitest-browser test multiple times (to
>>> investigate an intermittent), often I end up running all the
>>> mochitest-browser tests, looking at every log until I find the chunk
>>> where the test is, and retrigger just that chunk. The chunk number
>>> changes based on the platform and debug/opt, so it's painful.
>>> Is there a way to trigger only the chunk that will contain a given
>>> test, so I can save running all of the other chunks?
>>> 
> 
> 



Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Gijs Kruitbosch
This only works once you have a run that failed the test you're 
interested in, right? There's no way to tell the test-centric UI "find 
me the chunk for test with name X".


~ Gijs

On 14/09/2017 16:55, Cameron Dawson wrote:
> Marco, I don’t know of a way to do exactly that yet.  But that is in the
> roadmap for the Test-based UI in Treeherder.  And the existing UI may help you
> there.
>
> On any push, click the down arrow (Action Menu) at the far right of the push
> status line and select “Experimental: Test-Centric UI”.
> From there you can see the list of tests that failed for that push (at this
> time, only for tests that log with the structured logging, but they include
> Mochitest).
> For each test, you’ll see a link to the chunk back in Treeherder where that
> test ran.  So you can go BACK to Treeherder to do your retrigger there.  This
> side-UI will be moving back into the main Treeherder repo soon, so you’ll be
> able to trigger directly from there at some point.
>
> I realize this workflow is a bit cumbersome, but perhaps better than poring
> through logs.  :)
>
> I’m actively working on this UI, so please give me any feedback you have in the
> form of bugs or in #treeherder.
>
> -Cam



On Sep 14, 2017, at 8:48 AM, Marco Bonardo  wrote:

When I need to retrigger a mochitest-browser test multiple times (to
investigate an intermittent), often I end up running all the
mochitest-browser tests, looking at every log until I find the chunk
where the test is, and retrigger just that chunk. The chunk number
changes based on the platform and debug/opt, so it's painful.
Is there a way to trigger only the chunk that will contain a given
test, so I can save running all of the other chunks?



Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Andrew Halberstadt
There's sort of a way to do this with try syntax. I say sort of because it
doesn't support all suites and there seem to be a few bugs with it. But
you can try:

./mach try -b o -p linux64 -u none path/to/dir/or/test

This should only run the directory or test you specified (it'll always show
up as chunk 1). I have vague plans to implement this a bit more robustly
for try_task_config.json based scheduling, but no time frame on when that
work might happen yet.
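
As a rough illustration of the command above, here is a minimal sketch that
assembles the same argument list for one or more test paths. The helper name
build_try_args is hypothetical and not part of mach; it uses only the flags
quoted above (-b, -p, -u) and does not push anything to try.

```python
import shlex


def build_try_args(paths, build="o", platform="linux64"):
    """Assemble the try-syntax command quoted above for the given paths.

    Hypothetical helper for illustration only; it is not part of mach.
    Only the flags that appear in the message (-b, -p, -u) are used.
    """
    if not paths:
        raise ValueError("at least one test path or directory is required")
    # Per the message, "-u none" combined with explicit paths should run
    # only the listed directories or tests.
    return ["./mach", "try", "-b", build, "-p", platform, "-u", "none", *paths]


args = build_try_args(["path/to/dir/or/test"])
# Quote each argument so the printed line is safe to paste into a shell.
print(" ".join(shlex.quote(a) for a in args))
```

Run as written, this prints the same command line as in the message. Passing
several paths simply appends them after "-u none".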

-Andrew

On Thu, Sep 14, 2017 at 11:48 AM Marco Bonardo  wrote:

> When I need to retrigger a mochitest-browser test multiple times (to
> investigate an intermittent), often I end up running all the
> mochitest-browser tests, looking at every log until I find the chunk
> where the test is, and retrigger just that chunk. The chunk number
> changes based on the platform and debug/opt, so it's painful.
> Is there a way to trigger only the chunk that will contain a given
> test, so I can save running all of the other chunks?


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Marco Bonardo
On Thu, Sep 14, 2017 at 5:56 PM, James Graham  wrote:
> On 14/09/17 16:48, Marco Bonardo wrote:
> mach try -p linux64 

Afaict, that runs a single folder, but the intermittent may be caused
by interactions across different tests in different folders. I'm not
up to date on what we do today: do we restart the test harness for
each folder? That'd basically solve my troubles.


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread James Graham

On 14/09/17 16:48, Marco Bonardo wrote:

When I need to retrigger a mochitest-browser test multiple times (to
investigate an intermittent), often I end up running all the
mochitest-browser tests, looking at every log until I find the chunk
where the test is, and retrigger just that chunk. The chunk number
changes based on the platform and debug/opt, so it's painful.
Is there a way to trigger only the chunk that will contain a given
test, so I can save running all of the other chunks?


You might be able to use

mach try -p linux64 

in order to run a single chunk with just the chosen tests.


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Cameron Dawson
Marco—  I don’t know of a way to do exactly that yet.  But that is in the 
roadmap for the Test-based UI in Treeherder.  And the existing UI may help you 
there.

On any push, click the down arrow (Action Menu) at the far right of the push 
status line and select “Experimental: Test-Centric UI”
From there you can see the list of tests that failed for that push (at this 
time, only for tests that log with the structured logging, but they include 
Mochitest)
For each test, you’ll see a link to the chunk back in Treeherder where that 
test ran.  So you can go BACK to Treeherder to do your retrigger there.  This 
side-UI will be moving back into the main Treeherder repo soon, so you’ll be 
able to trigger directly from there at some point.

I realize this workflow is a bit cumbersome, but perhaps better than poring 
through logs.  :)

I’m actively working on this UI, so please give me any feedback you have in the 
form of bugs or in #treeherder.

-Cam


> On Sep 14, 2017, at 8:48 AM, Marco Bonardo  wrote:
> 
> When I need to retrigger a mochitest-browser test multiple times (to
> investigate an intermittent), often I end up running all the
> mochitest-browser tests, looking at every log until I find the chunk
> where the test is, and retrigger just that chunk. The chunk number
> changes based on the platform and debug/opt, so it's painful.
> Is there a way to trigger only the chunk that will contain a given
> test, so I can save running all of the other chunks?


Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Michael de Boer

> On 14 Sep 2017, at 17:48, Marco Bonardo  wrote:
> 
> When I need to retrigger a mochitest-browser test multiple times (to
> investigate an intermittent), often I end up running all the
> mochitest-browser tests, looking at every log until I find the chunk
> where the test is, and retrigger just that chunk. The chunk number
> changes based on the platform and debug/opt, so it's painful.
> Is there a way to trigger only the chunk that will contain a given
> test, so I can save running all of the other chunks?

This! This! This! I’d love to be able to do this - it would make testing 
possible test failure fixes sooo much easier.

Cheers,

Mike.



Re: Reminder on Try usage and infrastructure resources

2017-09-14 Thread Marco Bonardo
When I need to retrigger a mochitest-browser test multiple times (to
investigate an intermittent), often I end up running all the
mochitest-browser tests, looking at every log until I find the chunk
where the test is, and retrigger just that chunk. The chunk number
changes based on the platform and debug/opt, so it's painful.
Is there a way to trigger only the chunk that will contain a given
test, so I can save running all of the other chunks?


Reminder on Try usage and infrastructure resources

2017-09-14 Thread Stuart Philp
Hello all,

As we near 57, the Firefox CI group felt it was important to send out a bit
of a reminder regarding infrastructure usage when you push.

*tl;dr* There is a real cost (both time and $) to using the 'all' flags in
pushes. They are there if you need them, but please remember to think about
what platforms and test suites you need to execute before you push, and
limit the scope of execution if you can.

A bit of background: our build and test infrastructure is a mix of physical
hardware and AWS cloud instances. AWS scales dynamically to our load, but
our physical hardware is limited. Occasionally you might see wait times and
queues build up; this is typically due to our hardware being overwhelmed.
When it gets really bad, we sometimes have to close the trees to allow the
machines to catch up. Obviously, that's not good for anyone. Specifically,
over the last few weeks we have seen a few long backlogs on our OSX
machines, once requiring tree closure. We never want to have to close
trees; it's a last resort, especially this close to beta.

Because of the physical hardware limitation, this is particularly
concerning for performance tests and tests that run on OSX (OSX builds are
now cross-compiled on Linux and not really affected). If you don't need to
run perf or OSX tests, please consider excluding them from your pushes.
ahal sent mail a few weeks ago about the new fuzzy matching tool, which can
be useful here to help you figure out what to select.

To give you an idea of scale, we average 1000 pushes per week on
integration branches (excluding try). Our desktop tests alone (excluding
numbers for android, build jobs, and a handful of others) use roughly 900
machine hours per push, or about 900k machine hours per week combined.
Including try and those other configurations, you can roughly double these
numbers. Needless to say, that's a lot of machine time, and so any savings
we can get can really add up.
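
The arithmetic behind those figures can be checked with a short sketch. The
inputs are the rounded numbers quoted above; the "machines running around the
clock" figure is only an illustrative conversion, not a statement about the
actual size of the pool.

```python
# Rounded figures quoted in the message above.
pushes_per_week = 1000        # integration branches, excluding try
machine_hours_per_push = 900  # desktop tests only

weekly_hours = pushes_per_week * machine_hours_per_push
# "Roughly double" once try and the other configurations are included.
weekly_hours_with_try = 2 * weekly_hours

# Illustrative conversion: how many machines would be busy 24/7 to
# deliver that many machine hours in one week.
hours_in_week = 7 * 24
always_on_machines = weekly_hours / hours_in_week

print(f"{weekly_hours:,} machine hours per week")
print(f"~{weekly_hours_with_try:,} machine hours per week including try")
print(f"~{always_on_machines:,.0f} machines running around the clock")
```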

We are continuously monitoring our capacity requirements for today and for
the future (new platforms, updated OSes, new experiments, new tests, etc).
But it's a dynamic problem, and sometimes things pile up. While we accept
that today, it's a problem we want to further limit in the future. There
are a lot of interesting things we're working on here, such as selective
test execution, intermittent reduction strategies, smarter tooling, and
smarter infrastructure allocation that will hopefully go a long way to
reducing these issues. We'll continue to update everyone here as we make
those improvements.

In the meantime, just a reminder to be diligent with what platforms and
test suites you are running.

If you have any questions feel free to reach out.

Thanks!