This might or might not work, since a GitHub PR is marked as failed or passed
based on the overall CI run status, not just a few builds from it. But it is a
good suggestion to try out; we will evaluate whether it can be accomplished. Thanks!
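
To make the suggested commands concrete, here is a minimal sketch of how a bot
might interpret a PR comment and decide which pipelines to trigger. It is not
the actual bot: the role names, the "failure" status string, and both function
names are assumptions made for illustration. Only the "[mxnet-ci] run",
"run all" and "run failed" command texts and the pipeline names come from this
thread.

```python
import re
from typing import Dict, List, Optional

# Hypothetical role names for the three groups named in the proposal.
AUTHORIZED_ROLES = {"jenkins_admin", "mxnet_committer", "pr_author"}

# Matches "[mxnet-ci] run", optionally followed by "all", "failed", or a
# pipeline name, on a line of its own. [ \t] is used instead of \s so the
# optional argument cannot be picked up from the next line of the comment.
COMMAND_RE = re.compile(
    r"^[ \t]*\[mxnet-ci\][ \t]+run(?:[ \t]+(\S+))?[ \t]*$", re.MULTILINE
)


def parse_command(comment: str) -> Optional[str]:
    """Return the run target ("all", "failed", or a pipeline name), or None."""
    match = COMMAND_RE.search(comment)
    if match is None:
        return None
    # A bare "[mxnet-ci] run" is treated as "run all" (an assumption).
    return match.group(1) or "all"


def pipelines_to_run(comment: str, role: str, statuses: Dict[str, str]) -> List[str]:
    """Pick pipelines to trigger; statuses maps pipeline name -> last result."""
    target = parse_command(comment)
    if target is None or role not in AUTHORIZED_ROLES:
        return []
    if target == "all":
        return sorted(statuses)
    if target == "failed":
        # "failure" as the status value is an assumption for this sketch.
        return sorted(name for name, result in statuses.items() if result == "failure")
    return [target] if target in statuses else []
```

Even with "run failed", the PR would only turn green once the overall status
reflects the re-run pipelines, which is the open question mentioned above.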



On 2/12/20, 11:05 AM, "Lausen, Leonard" <lau...@amazon.com.INVALID> wrote:

    Thank you Denis for taking up this initiative. With respect to "Introduce
    per-PR CI bot" and the "[mxnet-ci] run" command: would it make sense to add
    "retriggering only failed pipelines" to the scope? For example, users could be
    asked to specify the name of the pipeline, or have "[mxnet-ci] run all" and
    "[mxnet-ci] run failed".
    
    In the current state, when retriggering all pipelines, it's likely that one of
    them will fail. Retriggering only the failed pipeline gives a higher chance of
    arriving at a state where all pipelines have succeeded.
    
    On Wed, 2020-02-12 at 10:12 -0800, Davydenko, Denis wrote:
    > Hello, MXNet dev community,
    > As you all know, the experience with CI infrastructure isn’t ideal in spite of
    > its high cost. For this reason, we’re proposing the following changes to
    > improve stability, reduce cost, and grant more control to contributors. As we
    > work on a refresh of CI, we believe these changes will reduce the pain we all
    > suffer when we try to push a PR through the system.
    > 
    > Following is the list of changes:
    > - Fix missing status reports between GH and Jenkins
    > - Update Jenkins permission groups to re-trigger builds
    > - Introduce per-PR CI bot
    >
    > Details:
    > 
    > - Fix missing status reports
    > Currently, once a commit is added to a PR, CI runs on that commit.
    > Sometimes the CI run status is missing from the commit in GitHub despite
    > the run having completed in Jenkins.
    > Example CI run:
    > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-17376/17/pipeline
    > Commit status in GitHub (missing unix-cpu, unix-gpu and windows-gpu
    > statuses):
    > https://github.com/apache/incubator-mxnet/pull/17376#partial-pull-merging
    > Problem: There seems to be a bug where some status reports are missing on
    > GitHub. The hypothesis is that there is some issue with GitHub hooks.
    > 
    > - Update Jenkins permission groups to re-trigger builds
    > Problem: Currently, only MXNet Committers and selected people from AWS have
    > the ability to re-trigger CI runs on PRs. This leaves PR Authors waiting
    > for authorized users to re-trigger builds on their PRs for them.
    > Solution: Allow three membership categories (Jenkins Admins, MXNet Committers,
    > and PR Authors) to re-trigger PR builds.
    > 
    > - Introduce per-PR CI bot
    > Problem: As of today, MXNet CI is automated. It runs every time a commit is
    > pushed to your GitHub PR. This results in a lot of unnecessary CI runs and
    > added cost.
    > Solution: Switch to a manual trigger. Users from authorized groups (one of the
    > three categories mentioned above) can trigger a CI run by adding a simple
    > comment to the PR: “[mxnet-ci] run”.
    > 
    > --
    > Thank you,
    > 
    > AWS MXNet team
    > 
    >  
    > 
    
