On Wed, Feb 24, 2021 at 2:36 PM Marcin Sobczyk <[email protected]> wrote:
>
> On 2/24/21 11:05 AM, Yedidyah Bar David wrote:
> > On Wed, Feb 24, 2021 at 11:43 AM Milan Zamazal <[email protected]> wrote:
> >> Yedidyah Bar David <[email protected]> writes:
> >>
> >>> Hi all,
> >>>
> >>> Right now, when we merge a patch e.g. to the engine (and many other
> >>> projects), it can take up to several days until it is used by the
> >>> hosted-engine ovirt-system-tests suite. Something similar will happen
> >>> soon if/when we introduce suites that use ovirt-node.
> >>>
> >>> If I got it right:
> >>> - A merge causes CI to build the engine - immediately; takes ~1 hour (say)
> >>> - A publisher job [1] publishes it to resources.ovirt.org (daily, at
> >>>   midnight UTC)
> >>> - The next run of the appliance build [2] includes it (daily, in the
> >>>   afternoon)
> >>> - The next run of the publisher [1] publishes the appliance (daily, at
> >>>   midnight)
> >>> - The next run of ost-images [3] includes the appliance (daily, at
> >>>   midnight, 2 hours after the publisher), and publishes it immediately
> >>> - The next run of OST (e.g. [4]) will use it (daily, slightly *before*
> >>>   ost-images, but I guess we can change that; also, this does not
> >>>   affect manual runs of OST, so it can probably be ignored in the
> >>>   calculation, at least to some extent)
> >>>
> >>> So if I got it right, a patch merged to the engine in the morning will
> >>> be used by the nightly run of OST HE only after almost 3 days, and
> >>> will be available for manual runs after 2 days. IMO that's too much
> >>> time. I might be somewhat wrong, but not by much, I think.
> >>>
> >>> One partial solution is to add automation .repos lines to the relevant
> >>> projects, pointing at lastSuccessfulBuild (let's call it lastSB) of
> >>> the more important projects they consume - e.g. the appliance would
> >>> use lastSB of engine+dwh+a few others, node would use lastSB of vdsm,
> >>> etc. This will require more maintenance (adding/removing/fixing
> >>> projects as needed) and cause some more load on CI (as packages will
> >>> then be downloaded from it instead of from resources.ovirt.org).
> >>>
> >>> Another solution is to run the relevant jobs (publisher/appliance/node)
> >>> far more often - say, once every two hours.
> >>
> >> One important thing to consider is the ability to run OST on our
> >> patches at all. If there is (almost) always a newer build available,
> >> then custom repos added to OST runs, whether on Jenkins or locally,
> >> will be ignored, and we'll be unable to test our patches before they
> >> are merged.
> >
> > Indeed. That's an important point. IIRC OST has a ticket specifically
> > addressing this issue.
>
> Yes, we have:
>
> https://gerrit.ovirt.org/#/c/ovirt-system-tests/+/113223/
>
> and:
>
> https://issues.redhat.com/browse/RHV-41025
>
> which is not implemented yet.
>
> The downside of upgrading to the latest RPMs from the 'tested' repo is,
> as Milan mentioned, an increased chance that your own packages will not
> be used, because they're too old.
> The upside is that if someone breaks OST globally with e.g. some engine
> patch, and a fix for the problem is merged midday, upgrading to the
> latest RPMs will unblock the runs.
> If we don't upgrade, we'll have to wait for the nightly job to rebuild
> ost-images to include the fix.
> Rebuilding ost-images midday is an option, but it takes a lot of time,
> so in most cases one can simply wait till tomorrow...
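For reference, "upgrading to the latest RPMs from 'tested'" on an
already-deployed setup boils down to roughly the following. This is only
a sketch - the repo URL below assumes the current el8 layout under
resources.ovirt.org, so verify it before relying on it:

    # point dnf at the 'tested' repo (URL layout is an assumption)
    dnf config-manager --add-repo \
        https://resources.ovirt.org/repos/ovirt/tested/master/rpm/el8/
    # then pull in whatever is newer there
    dnf upgrade -y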
>
> I want to fix this by implementing an option in OST's manual runs
> (switched off by default) that will allow you to upgrade to the latest
> RPMs from 'tested'. That way, one's patches have a window of ~24h during
> which they are fresh enough to be picked up by dnf (see the note below).
>
> 'check-patch' jobs should always use the latest RPMs from 'tested' IMO.
>
> >
> >>> This will also add load, and might cause "perceived" instability - as
> >>> things will likely fluctuate between green and red more often.
> >>
> >> This doesn't sound very good; I perceive things as less than stable
> >> already now.
> >
> > Agreed.
> >
> > I quoted "perceived" because I do not think they'll actually be less
> > stable. Right now, when something critical is broken, we fix it, then
> > manually run some of the above jobs as needed, to quickly get back to
> > business. When we don't (often), some things simply remain broken for
> > two days.
> >
> > Running more often will simply notify us about breakage faster. If we
> > then fix, it will automatically propagate the fix faster.
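A note on the "picked up by dnf" point above: by default, dnf simply
installs the highest version-release it can see, regardless of which repo
provides it, so a custom repo only wins for as long as its builds are
newer than what 'tested' ships. A quick way to see what dnf sees (the
package name here is just an example):

    # list all ovirt-engine builds visible from the enabled repos;
    # dnf picks the highest NVR, so a week-old custom build loses to
    # a fresher build from 'tested'
    dnf list --showduplicates ovirt-engine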
>
> Isn't upgrading the engine RPM on the appliance an option?

You mean, as part of the OST run itself?

Generally speaking, 'hosted-engine --deploy' already does that, but in
practice this does not work in CI. I didn't check recently why - probably
some configuration (repos), a missing proxy, or something like that.

It's done in a task called 'Update all packages' (something to search for
in the logs, if you feel like it - see the P.S. below). It can be
controlled from the CLI with he_offline_deployment [1], but I do not see
anywhere that we use this in CI.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1851860

> >
> >>> I think I prefer the latter. What do you think?
> >>
> >> Wouldn't it be possible to run the whole pipeline nightly (even if it
> >> means e.g. running the publisher twice during the night)?
> >
> > It will. But this will only fix the specific issue of appliance/node.
> > Running more often also simply gives feedback faster.
> >
> > But I agree that perhaps we should wait with this until OST allows
> > using a custom repo reliably and easily.

Thanks,
--
Didi
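P.S. Regarding the 'Update all packages' task above: if someone wants to
dig into why the in-deploy upgrade fails in CI, a starting point could be
something like the following - again just a sketch; the exact log file
names under /var/log/ovirt-hosted-engine-setup/ may differ per run:

    # show some context around the 'Update all packages' task in the
    # hosted-engine-setup logs
    grep -B 2 -A 10 'Update all packages' \
        /var/log/ovirt-hosted-engine-setup/*.log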
