We had this structure in the past, and the community was bothered by CI
taking longer, so we moved to the current model with everything
parallelized. This proposal would basically revert that.

Can you show by how much the duration will increase?

Also, we have zero test parallelisation; that is, we run one test at a
time on 72-core machines (albeit with multiple workers). Wouldn't it be far
more efficient to add parallelisation, and thus heavily reduce the time
spent on the tasks, instead of staggering?
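
To frame the duration question, here is a minimal back-of-the-envelope
sketch, not a measurement. The only number taken from the proposal is the
roughly 55% gate failure rate; the function name, the durations and the cost
units are hypothetical placeholders that would need real Jenkins data.

```python
def compare_ci_models(p_gate_fail, gate_minutes, rest_minutes, gate_cost, rest_cost):
    """Compare fully parallel vs. staggered CI for one PR run.

    p_gate_fail  -- probability that sanity or unix-cpu fails (0.55 in the proposal)
    gate_minutes -- hypothetical wall-clock time of the gate stage (sanity + unix-cpu)
    rest_minutes -- hypothetical wall-clock time of the slowest remaining pipeline
    gate_cost, rest_cost -- hypothetical machine cost of each group, in any unit
    """
    # Fully parallel: everything starts at once, so a passing PR waits for the
    # slowest stage only, and every stage is always paid for.
    parallel_minutes = max(gate_minutes, rest_minutes)
    parallel_cost = gate_cost + rest_cost

    # Staggered: the remaining pipelines start only after the gate passes,
    # so a passing PR waits for both groups back to back, but the remaining
    # pipelines are paid for only when the gate passes.
    staggered_minutes = gate_minutes + rest_minutes
    staggered_expected_cost = gate_cost + (1.0 - p_gate_fail) * rest_cost

    return {
        "parallel_minutes": parallel_minutes,
        "staggered_minutes": staggered_minutes,
        "parallel_cost": parallel_cost,
        "staggered_expected_cost": staggered_expected_cost,
    }
```

Under these assumptions, a passing PR always waits `gate_minutes` longer in
the staggered model whenever the gate is the shorter group, while the
expected saving is `p_gate_fail * rest_cost`. Shortening `rest_minutes`
through test parallelisation would reduce both wall-clock time and cost.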

I am concerned that these cost-saving measures are being paid for with a
worse user experience. I see big potential to save costs by increasing
efficiency while actually improving the user experience due to CI being


Joe Evans <joseph.ev...@gmail.com> wrote on Wed, 25 March 2020, 04:58:

> Hi,
> First, I just wanted to introduce myself to the MXNet community. I’m Joe
> and will be working with Chai and the AWS team to improve some issues
> around MXNet CI. One of our goals is to reduce the costs associated with
> running MXNet CI. The task I’m working on now is this issue:
> https://github.com/apache/incubator-mxnet/issues/17802
> Proposal: Staggered Jenkins CI pipeline
> Based on data collected from Jenkins, around 55% of the time when the
> mxnet-validation CI build is triggered by a PR, either the sanity or
> unix-cpu builds fail. When either of these builds fail, it doesn’t make
> sense to run the rest of the pipelines and utilize all those resources if
> we’ve already identified a build or unit test failure.
> We are proposing changing the MXNet Jenkins CI pipeline by requiring the
> *sanity* and *unix-cpu* builds to complete and pass tests successfully
> before starting the other build pipelines (centos-cpu/gpu, unix-gpu,
> windows-cpu/gpu, etc.) Once the sanity builds successfully complete, the
> remaining build pipelines will be triggered and run in parallel (as they
> currently do.) The purpose of this change is to identify faulty code or
> compatibility issues early and prevent further execution of CI builds. This
> will increase the time required to test a PR, but will prevent unnecessary
> builds from running.
> Does anyone have any concerns with this change or suggestions?
> Thanks.
> Joe Evans
> joseph.ev...@gmail.com
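
For reference, the gating described in the proposal above could be sketched
roughly as follows. This is plain Python rather than the actual Jenkinsfile,
and `run_stage` is a hypothetical callable standing in for "trigger this
pipeline and report whether it passed". Running the two gate builds in
parallel with each other is an assumption; the proposal does not say.

```python
from concurrent.futures import ThreadPoolExecutor

# Stage names as listed in the proposal; its "etc." means the second list
# is not exhaustive.
GATE_STAGES = ["sanity", "unix-cpu"]
REMAINING_STAGES = ["centos-cpu", "centos-gpu", "unix-gpu",
                    "windows-cpu", "windows-gpu"]


def run_staggered(run_stage):
    """Run the gate stages first; start the rest only if every gate passes.

    run_stage(name) -> bool is a hypothetical hook, not a Jenkins API.
    Returns a dict mapping each stage that actually ran to its result.
    """
    with ThreadPoolExecutor() as pool:
        # Gate: sanity and unix-cpu run concurrently (assumption).
        results = dict(zip(GATE_STAGES, pool.map(run_stage, GATE_STAGES)))
        if not all(results.values()):
            # A gate failed, so the remaining pipelines are never started.
            return results
        # Gate passed: the remaining pipelines run in parallel, as they do today.
        results.update(zip(REMAINING_STAGES, pool.map(run_stage, REMAINING_STAGES)))
    return results
```

For example, `run_staggered(lambda name: name != "sanity")` returns results
for the two gate stages only, because the failing sanity build stops the
remaining pipelines from being triggered.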
