Adding Evgeny and Shirly who are, AFAIK, the owners of the metrics suite.

On Sun, 1 Sep 2019 at 17:07, Barak Korren <[email protected]> wrote:

> If you have been using or monitoring any OST suites recently, you may have
> noticed we've been suffering from long delays in allocating CI hardware
> resources for running OST suites. I'd like to briefly discuss the reasons
> behind this, what we are planning to do to resolve it, and the implications
> of those actions for owners of big suites.
>
> As you might know, a while ago we moved from running each OST suite
> on its own dedicated server to running them inside containers managed by
> OpenShift. That allowed us to run multiple OST suites on the same
> bare-metal host, which in turn increased our overall capacity by 50% while
> still allowing us to free up hardware for accommodating the kubevirt
> project on our CI hardware.
>
> Our infrastructure is currently built in a way where we use the exact same
> POD specification (and therefore resource settings) for all suites. Making
> it more flexible at this point would require significant code changes we
> are not likely to make. What this means is that we need to make sure our
> PODs have enough resources to run the most demanding suites. It also means
> we waste some resources when running less demanding ones.
>
> Given the set of OST suites we have ATM, we sized our PODs to allocate
> 32GiB of RAM. Given the servers we have, this means we can run 15 suites
> in parallel. This was sufficient for a while, but given increasing
> demand, and the expectation that it will increase further once we introduce
> the patch gating features we've been working on, we must find a way to
> significantly increase our suite-running capacity.
>
> We have measured the amount of RAM required by each suite and came to the
> conclusion that for the vast majority of suites, we could settle for PODs
> that allocate only 14GiB of RAM. If we make that change, we would be able
> to run a total of 40 suites at a time, almost tripling our current capacity.
>
> The downside of making this change is that our STDCI V2 infrastructure
> will no longer be able to run suites that require more than 14GiB of RAM.
> This effectively means it would no longer be possible to run these suites
> from OST's check-patch job or from the OST manual job.
>
> The list of suites that would be affected follows. The suite owners, as
> documented in the CI configuration, have been added as "to" recipients of
> this message:
>
>    - hc-basic-suite-4.3
>    - hc-basic-suite-master
>    - metrics-suite-4.3
>
> Since we're aware people would still like to be able to work with the
> bigger suites, we will leverage the nightly suite invocation jobs to enable
> them to be run in the CI infra. We will support the following use cases:
>
>    - *Periodically running the suite on the latest oVirt packages* - this
>    will be done by the nightly job, as it is done today
>    - *Running the suite to test changes to the suite's code* - while
>    currently this is done automatically by check-patch, in the future it
>    would have to be done by manually triggering the nightly job and
>    setting the REFSPEC parameter to point to the examined patch
>    - *Triggering the suite manually* - this would be done by triggering
>    the suite-specific nightly job (as opposed to the general OST manual job)
>
>  The patches listed below implement the changes outlined above:
>
>    - 102757 <https://gerrit.ovirt.org/102757> nightly-system-tests: big
>    suits -> big containers
>    - 102771 <https://gerrit.ovirt.org/102771>: stdci: Drop `big` suits
>    from check-patch
>
> We know that making the changes we presented will make things a little
> less convenient for users and maintainers of the big suites, but we believe
> the benefits of having vastly increased execution capacity for all other
> suites outweigh those shortcomings.
>
> We would like to hear all relevant comments and questions from the suite
> owners and other interested parties, especially if you think we should not
> carry out the changes we propose.
> Please take the time to respond on this thread, or on the linked patches.
>
> Thanks,
>
> --
> Barak Korren
> RHV DevOps team , RHCE, RHCi
> Red Hat EMEA
> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>
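
For anyone who wants to check the capacity figures quoted above, here is a
minimal Python sketch. It uses only the numbers stated in the message (32GiB
and 15 parallel suites today, 14GiB and 40 parallel suites proposed). The
function names are hypothetical, and the sketch does not model how PODs are
actually packed onto hosts, since the message does not describe that.

```python
# Figures taken from the quoted message; nothing here is measured.
CURRENT_POD_RAM_GIB = 32
CURRENT_PARALLEL_SUITES = 15
PROPOSED_POD_RAM_GIB = 14
PROPOSED_PARALLEL_SUITES = 40


def capacity_gain(current_suites, proposed_suites):
    """Return the ratio of proposed to current parallel suite capacity."""
    return proposed_suites / current_suites


def fits_standard_pod(suite_ram_gib, pod_ram_gib=PROPOSED_POD_RAM_GIB):
    """Return True if a suite needs no more RAM than the POD allocates.

    Suites for which this is False would have to run from their
    suite-specific nightly job instead of check-patch (hypothetical helper).
    """
    return suite_ram_gib <= pod_ram_gib


if __name__ == "__main__":
    gain = capacity_gain(CURRENT_PARALLEL_SUITES, PROPOSED_PARALLEL_SUITES)
    print(f"Capacity gain: {gain:.2f}x")
```

A ratio of 40 to 15 is what the message rounds to "almost tripling".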


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted