On Wed, Feb 24, 2021 at 2:36 PM Marcin Sobczyk <[email protected]> wrote:
>
>
>
> On 2/24/21 11:05 AM, Yedidyah Bar David wrote:
> > On Wed, Feb 24, 2021 at 11:43 AM Milan Zamazal <[email protected]> wrote:
> >> Yedidyah Bar David <[email protected]> writes:
> >>
> >>> Hi all,
> >>>
> >>> Right now, when we merge a patch e.g. to the engine (and many other
> >>> projects), it can take up to several days until it is used by the
> >>> hosted-engine ovirt-system-tests suite. Something similar will happen
> >>> soon if/when we introduce suites that use ovirt-node.
> >>>
> >>> If I got it right:
> >>> - Merge causes CI to build the engine - immediately, takes ~ 1 hour (say)
> >>> - A publisher job [1] publishes it to resources.ovirt.org (daily,
> >>> midnight (UTC))
> >>> - The next run of an appliance build [2] includes it (daily, afternoon)
> >>> - The next run of the publisher [1] publishes the appliance (daily, 
> >>> midnight)
> >>> - The next run of ost-images [3] includes the appliance (daily,
> >>> midnight, 2 hours after the publisher) (and publishes it immediately)
> >>> - The next run of ost (e.g. [4]) will use it (daily, slightly *before*
> >>> ost-images, but I guess we can change that. And this does not affect
> >>> manual runs of OST, so can probably be ignored in the calculation, at
> >>> least to some extent).
> >>>
> >>> So if I got it right, a patch merged to the engine on some morning
> >>> will be used by the nightly run of OST HE only almost 3 days later,
> >>> and be available for manual runs after 2 days. IMO that's too much time.
> >>> I might be somewhat wrong, but not by much, I think.
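> >>>
> >>> To make the arithmetic concrete, here is a rough sketch in Python;
> >>> the exact schedule times are my assumptions from the list above:
> >>>
> >>> from datetime import datetime, timedelta
> >>>
> >>> # Each daily job runs at a fixed hour (UTC); find its next run.
> >>> def next_run(after, hour):
> >>>     run = after.replace(hour=hour, minute=0, second=0, microsecond=0)
> >>>     return run if run > after else run + timedelta(days=1)
> >>>
> >>> merge = datetime(2021, 2, 22, 9, 0)  # patch merged Monday 09:00
> >>> build = merge + timedelta(hours=1)   # CI build, ~1 hour
> >>> publish1 = next_run(build, 0)        # publisher, midnight -> Tue 00:00
> >>> appliance = next_run(publish1, 14)   # appliance, afternoon -> Tue 14:00
> >>> publish2 = next_run(appliance, 0)    # publisher again -> Wed 00:00
> >>> ost_images = next_run(publish2, 2)   # ost-images -> Wed 02:00
> >>> print(ost_images - merge)            # 1 day, 17:00 -> roughly 2 days,
> >>> # and the first OST nightly run *after* Wed 02:00 is only around Thu
> >>> # midnight, i.e. almost 3 days after the merge.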
> >>>
> >>> One partial solution is to add automation .repos lines to relevant
> >>> projects that point at the lastSuccessfulBuild (let's call it lastSB)
> >>> of the more important projects they consume - e.g. the appliance would
> >>> use lastSB of engine+dwh+a few others, node would use lastSB of vdsm,
> >>> etc. This will require more maintenance (adding/removing/fixing
> >>> projects as needed) and cause some more load on CI (as packages will
> >>> then be downloaded from it instead of from resources.ovirt.org).
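> >>>
> >>> For illustration only (the job name and URL below are made up, not
> >>> the real ones), such a .repos line could look roughly like:
> >>>
> >>> ovirt-engine-master,https://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-el8-x86_64/lastSuccessfulBuild/artifact/exported-artifacts/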
> >>>
> >>> Another solution is to run relevant jobs (publisher/appliance/node)
> >>> far more often - say, once every two hours.
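> >>>
> >>> With Jenkins Job Builder that could be just a timer trigger on the
> >>> relevant jobs - a sketch, not the actual job definitions:
> >>>
> >>> triggers:
> >>>   - timed: 'H */2 * * *'  # run every ~2 hours, minute spread by hash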
> >> One important thing to consider is the ability to run OST on our
> >> patches at all.  If there is (almost) always a newer build available,
> >> then custom repos added to OST runs, whether on Jenkins or locally,
> >> will be ignored and we'll be unable to test our patches before they
> >> are merged.
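> >>
> >> For instance (versions made up), dnf simply resolves to the highest
> >> version it can see across all enabled repos:
> >>
> >> # dnf repoquery ovirt-engine --showduplicates  (output sketch)
> >> ovirt-engine-0:4.4.5-0.0.master.20210223.el8.noarch  <- custom repo, my patch
> >> ovirt-engine-0:4.4.5-0.0.master.20210224.el8.noarch  <- 'tested', wins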
> > Indeed. That's an important point. IIRC OST has a ticket specifically
> > addressing this issue.
> Yes, we have:
>
> https://gerrit.ovirt.org/#/c/ovirt-system-tests/+/113223/
>
> and:
>
> https://issues.redhat.com/browse/RHV-41025
>
> which is not implemented yet.
>
> The downside of upgrading to the latest RPMs from the 'tested' repo is,
> as Milan mentioned, an increased chance that your own packages will not
> be used because they're too old.
> The upside is that if someone breaks OST globally with e.g. some engine
> patch, and a fix for the problem is merged midday, upgrading to the
> latest RPMs will unblock the runs.
> If we don't upgrade, we'll have to wait for the nightly job to rebuild
> ost-images to include the fix.
> Rebuilding ost-images midday is an option, but it takes a lot of time,
> so in most cases one can simply wait till tomorrow...
>
> I want to fix this by implementing an option in OST's manual run
> (switched off by default) that will allow you to upgrade to the latest
> RPMs from 'tested'. That way one's patches have a ~24h window in which
> they are fresh enough to be picked up by dnf.
>
> 'check-patch' jobs should always use the latest RPMs from 'tested', IMO.
>
> >
> >>> This will also add load, and might cause "perceived" instability - as
> >>> things will likely fluctuate between green and red more often.
> >> This doesn't sound very good; I perceive things as less than stable
> >> already now.
> > Agreed.
> > I quoted "perceived" because I do not think they'll actually be less stable.
> > Right now, when something critical is broken, we fix it, then manually
> > run some of the above jobs as needed, to quickly get back to business.
> > When we don't (often), some things simply remain broken for two days.
> >
> > Running more often will simply notify us about breakage faster. If we
> > then fix it, the fix will automatically propagate faster.
> Isn't upgrading the engine RPM on the appliance an option?

You mean, as part of the OST run itself?

Generally speaking, 'hosted-engine --deploy' already does that, but in
practice it does not work in CI. I haven't checked recently why -
probably some configuration (repos), a missing proxy, or something like
that. It's done in a task called 'Update all packages' (something to
search for in the logs if you feel like it).
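
For example, something like this (the log path is from memory and may
differ):

  grep 'Update all packages' /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log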

It can be controlled from the CLI with he_offline_deployment [1],
but I do not see anywhere that we use this in CI.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1851860

>
> >
> >>> I think I prefer the latter. What do you think?
> >> Wouldn't it be possible to run the whole pipeline nightly (even if it
> >> means e.g. running the publisher twice during the night)?
> > It would. But this will only fix the specific issue of the
> > appliance/node delay. Running more often also simply gives feedback
> > faster.
> >
> > But I agree that perhaps we should wait with this until OST allows
> > using a custom repo reliably and easily.
> >
> > Thanks,
>


-- 
Didi