Hi Denis,

"pipeline" may be the wrong word; "job" may be the correct one. For example,
committers can currently open a job page such as
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-17521/5/
press "Login" and then the restart button to retrigger only that job, obtaining
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-17521/6/
This is correctly reported to GitHub, and the status will change from failed
to passed depending on the result of the new job.

Best regards
Leonard

On Wed, 2020-02-12 at 20:23 +0000, Davydenko, Denis wrote:
> This might or might not work, given that a GH PR is marked failed or not
> based on the overall CI run status, not just a few builds from it. But it is
> a good suggestion to try out; we will evaluate whether it could be
> accomplished. Thanks!
>
> On 2/12/20, 11:05 AM, "Lausen, Leonard" <lau...@amazon.com.INVALID> wrote:
>
>     Thank you Denis for taking up this initiative. With respect to
>     "Introduce per-PR CI bot" and the "[mxnet-ci] run" command: would it
>     make sense to add "retriggering only failed pipelines" to the scope?
>     For example, users could be asked to specify the name of the pipeline,
>     or have "[mxnet-ci] run all" and "[mxnet-ci] run failed".
>
>     In the current state, when retriggering all pipelines, it's likely that
>     one of them will fail. Only by retriggering the failed pipeline alone
>     is there a higher chance to arrive at a state where all pipelines have
>     succeeded.
>
>     On Wed, 2020-02-12 at 10:12 -0800, Davydenko, Denis wrote:
>     > Hello, MXNet dev community,
>     >
>     > As you all know, the experience with CI infrastructure isn't ideal in
>     > spite of its high cost. For this reason, we're proposing the following
>     > changes to improve stability, reduce cost, and grant more control to
>     > contributors. As we work on a refresh of CI, we believe these changes
>     > will reduce the pain we all suffer when we try to push a PR through
>     > the system.
>     >
>     > Following is the list of changes:
>     > - Fix missing status reports between GH and Jenkins
>     > - Update Jenkins permission groups to re-trigger builds
>     > - Introduce per-PR CI bot
>     >
>     > Details:
>     >
>     > - Fix missing status reports
>     > Currently, once a commit is added to a PR, CI is run on that commit.
>     > Sometimes, the CI run status is missing from the commit in GitHub
>     > despite the run having completed in Jenkins. Example CI run:
>     > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-17376/17/pipeline
>     > Commit status in GitHub (missing unix-cpu, unix-gpu and windows-gpu
>     > statuses):
>     > https://github.com/apache/incubator-mxnet/pull/17376#partial-pull-merging
>     > Problem: There seems to be a bug where some status reports are missing
>     > on GitHub. The hypothesis is that there is some issue with GitHub
>     > Hooks.
>     >
>     > - Update Jenkins permission groups to re-trigger builds
>     > Problem: Currently, only MXNet Committers and selected people from AWS
>     > have the ability to re-trigger CI runs on PRs. This leaves PR Authors
>     > waiting for authorized users to re-trigger their PRs for them.
>     > Solution: Allow three membership categories (Jenkins Admins, MXNet
>     > Committers, and PR Authors) to re-trigger PR builds.
>     >
>     > - Introduce per-PR CI bot
>     > Problem: Currently, MXNet CI is automated. It runs every time a commit
>     > is pushed to your GitHub PR. This results in a lot of unnecessary CI
>     > runs as well as added cost.
>     > Solution: Switch to a manual trigger. Users from authorized groups
>     > (one of the three categories mentioned above) can trigger a CI run by
>     > adding a simple comment to the PR: “[mxnet-ci] run”.
>     >
>     > --
>     > Thank you,
>     >
>     > AWS MXNet team
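
For illustration only, the comment commands discussed in this thread could be
parsed roughly as sketched here. This is a minimal Python sketch under stated
assumptions, not the bot's actual implementation: the function name
parse_ci_command, the role strings, and the rule that a bare "[mxnet-ci] run"
means "all" are hypothetical. Only "[mxnet-ci] run", "[mxnet-ci] run all" and
"[mxnet-ci] run failed", plus the three authorized groups, come from the
messages in this thread.

```python
import re
from typing import Optional

# Hypothetical role names standing in for the three groups named in the
# proposal (Jenkins Admins, MXNet Committers, PR Authors).
AUTHORIZED_ROLES = {"jenkins-admin", "committer", "pr-author"}

# A command must sit on its own line: "[mxnet-ci] run" with an optional
# single target ("all", "failed", or a job name such as "unix-gpu").
COMMAND_RE = re.compile(
    r"^[ \t]*\[mxnet-ci\][ \t]+run(?:[ \t]+(\S+))?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_ci_command(comment: str, role: str) -> Optional[str]:
    """Return 'all', 'failed', or a job name; None if nothing should run."""
    if role not in AUTHORIZED_ROLES:
        return None  # commenter is not allowed to trigger CI
    match = COMMAND_RE.search(comment)
    if match is None:
        return None  # comment contains no well-formed command line
    target = match.group(1)
    if target is None:
        return "all"  # assumption: a bare "run" retriggers everything
    if target.lower() in ("all", "failed"):
        return target.lower()
    return target  # treated as the name of a single job


if __name__ == "__main__":
    print(parse_ci_command("[mxnet-ci] run failed", "pr-author"))
    print(parse_ci_command("[mxnet-ci] run unix-gpu", "committer"))
```

Requiring the command to occupy a whole line keeps ordinary review comments
that merely mention the bot from triggering a run.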