Re: Enabling running Taskotron tasks on Koji scratch builds

Josef Skladanka Fri, 13 Jan 2017 04:39:07 -0800

I don't have much to add, just that I agree with Kamil. I see some minor
problems even in what he wrote, but that's well beyond what I think we
should be solving now.


j.

On Tue, Jan 10, 2017 at 5:19 PM, Kamil Paral <[email protected]> wrote:

> > Couldn't we use the task ID as the primary identifier but use the srpm
> > for readability sake since there is no "build" NVR for scratch builds?
>
> Systems like new hotness will need to query according to the task ID,
> though (to get reliable results). So we're talking about hacking just
> resultsdb *frontend* here, e.g. by having "item: task_id" and
> "item_display: nvr" in the results yaml. I don't like it much. Searching in
> the frontend would have to search both in item and item_display.
>
> Or we could use our existing "note" field to put NEVR inside, and it would
> be easily visible in the frontend without any ugly hacks. People would have
> to know that if they want to search by NEVR for scratch builds, they need
> to put the NEVR in the note search field (we'd have to add it).
>
> I assume you proposed using task id as primary identifier just to scratch
> builds (but not standard builds). If you also mean it for standard builds,
> then scheduling tasks locally starts to be quite inconvenient (for each
> build you want to run your task on, you need to find its task id first;
> it's hard to re-run something from shell history). We would also be
> changing something that's a de-facto standard in our infra (using NEVR as a
> unique identifier across projects).
>
>
> > Either that or make the primary ID a tuple of (srpm name, koji task
> > start time) either stored that way or rendered as a string e.g
> > foo-1.2-3.fc99.src.rpm built at 201701092100 (YYYYMMDDHHMM) would become
> > foo-1.2-3.fc99-201701092100.
>
> The same problem with inconvenient local execution. The command line
> starts to be hard to read (in tutorials, etc).
>
> Also, the srpm name doesn't have to be anything reasonable, Koji will
> happily accept anything (it doesn't care, for scratch builds). So we can
> easily receive "1.srpm" or "my-new-patch.srpm" from the fedmsg (we actually
> tried this [1]). Deriving any useful info from this input is probably a
> mistake. We would have to download the srpm and look into the included spec
> file to reliably decide which package this is related to.
>
> [1] https://apps.fedoraproject.org/datagrepper/id?id=2017-
> 2fe001d2-32d4-434a-b3a9-29ab31eebbb0&is_raw=true&size=extra-large
>
> >
> > > During a discussion with Kamil, a few solutions were mentioned (none
> > > of them is pretty):
> > >
> > > 1. We can ask koji developers if there is a way to add a method that
> > > would return all koji scratch builds for a given NVR - we would then
> > > take the latest one and work with it.
> >
> > How would we get the NVR if there's no build identifier on scratch
> > builds? Are you talking about the SRPM name in the fedmsg?
>
> Yes. Not from the fedmsg, but from the Koji task (the fedmsg gets this
> value from the srpm filename included in the Koji task, I believe). But see
> the problem described above with srpm file naming (can be arbitrary). So
> even better would be if Koji could search and return scratch build info
> containing the metadata of the srpm or from the spec file included.
>
> But I'm honestly not sure it is a good idea even if they wanted to
> implement it. More on that below.
>
> >
> > > 2. We can use "koji download-task" which works for both. That would
> > > mean koji_build item type would eat koji task IDs instead of NVRs.
> > > This would lead to having koji task IDs in resultsdb instead of NVR's
> > > which kills readability. Unless libtaskotron finds the NVR from the
> > > koji task ID in the case of "real" build" and stores it in a field in
> > > resultsdb...Also, we need NVRs for fedmsgs. So add code to the fedmsg
> > > layer that would take care of somehow adding a NVR to fedmsg of
> > > completed scratch builds tasks...
> >
> > Can't we tell the difference between an NVR and a task ID just by the
> > form that the content takes? Wouldn't the following work:
> >
> >   1. attempt to parse input as NVR, if there are no issues, assume that
> >      it's a build ID
> >   2. if the NVR parse fails, check to see if the input is an int. if
> >      assume that it's a koji task ID
>
> The reverse might be easier, but yes, that's of course possible. There are
> other caveats, though.
>
> >
> > It'd mean that our koji downloading code would get more complicated but
> > it seems like nothing that more unit tests couldn't mitigate. Unless
> > I'm mis-thinking here, the changes needed to pull this off in trigger
> > would be even smaller.
>
> I'm afraid our code could get infested with if clauses pretty soon. This
> is not just about downloading rpms in the koji directive. We have more koji
> build-related functionality and we'll sure get more in the future. For
> example, we currently support "download_latest_stable", but that might be
> very tricky for scratch builds with "1.srpm" files (which package is
> that?). If we ever needed to work with package epoch, it is unavailable for
> scratch builds (since there is no build info object to be returned).
> Another example, we have dist-git directive. Without clear identification
> of the package you can't use dist-git directive. Even if we download 1.srpm
> and look into it, you can still find something that's not available in
> dist-git (because you can build something in Koji that is not packaged in
> Fedora).
>
> What I'm trying to illustrate here is that this is not just a few if
> clauses in a single piece of code (koji utils). This can influence the
> whole workflow through the formula. Some directives might return different
> data based on whether it is a regular build or a scratch build. Some might
> completely fail to work (distgit directive). And you might want to have
> different standards and approaches for scratch builds.
>
> As an example, you might want to use distgit directive to download
> rpmlint.config for that particular package when running rpmlint. If that
> git repo doesn't exist, you want to error out (should never happen). But
> for scratch builds, it's helpful if you figure out which package it is and
> the config file is used, but you don't mind if such a repo doesn't exist -
> the rpmlint output without any personalized rpmlint.config is still
> valuable. How do you implement that? Will we depend on heuristics
> (recognizing standard vs scratch build) and hard-code whether it errors out
> or not? We can, but will we be always to do the right thing? What if some
> tasks have different preferences? Will we not end up with monster code full
> of if-then-else? It seems to me that it's a better approach to allow the
> task to decide what it wants to do and fully inform it in advance whether
> this is a standard build or a scratch build. That means using different
> item_type and therefore different formula. The advantage is that you can
> easily implement something like "fail_on_error: True/False" option for a
> directive and the task can set what it needs. Or the task will know whether
> it's possible to retrieve epoch for a build and will act accordingly.
>
> As a last example, different logic might be desirable for certain tasks
> even in the "task code" (not libtaskotron code) itself. With abicheck, we
> currently compare new koji build with the latest build in stable updates.
> But for scratch builds, different approach might be desired - e.g.
> comparing to the last standard/scratch build, regardless of the koji tag it
> is in. We actually talked about very this topic the last time I saw Sinny
> and Dodji. If we decided to merge standard builds and scratch builds into a
> single workflow, we don't allow tasks to take a different approach even if
> they wanted (unless they go into such extents like determining whether the
> current "item" is a standard or a scratch build, and then emulating e.g. a
> koji directive in their own code). Btw, what about tasks that want/need to
> run exclusively on just standard or just scratch builds?
>
> >
> > > 3. We can add a new action to the koji directive "download_task". And
> > > then have for each koji task two formulas, one for scratch build
> > > (using "download_task") and one for "real" build" (using
> > > "download_build). This would require writing code for supporting
> > > multiple formulas for one task and would result in having two almost
> > > identical formulas...
> >
> > I'm not a huge fan of 3 if there is any other reasonable way to do
> > things. Code duplication is not the way to go.
>
> I've been thinking about this and I actually like it the most. I already
> described much of the issues with compounding scratch+standard builds in
> the text above, but let me expand on that. I currently believe having a
> separate koji_build and koji_scratch_build type is the best and the most
> technically clean way to go. These two events look very similar on the
> first (and second and third) glance, but they are implemented very
> differently. There's a good reason that their fedmsgs look not alike at
> all. Scratch builds have no restrictions on NEVRs, no uniqueness
> constraints, no dist-git constraints (anything can be built), they do not
> have any API (except a completely generic "task" API), we can't get any
> structured metadata about them. They are very different beasts. Compare
> this to e.g. DNF repo compose, ostree compose, and docker compose. They are
> all "composes", but sufficiently different to get different item types and
> different handling in directives (or probably even some exclusive
> directives). I believe this is the same case.
>
> The code duplication doesn't have to be a problem, I think. In package
> distgit, you'll have koji_build/mytest/runtask.yaml and
> koji_scratch_build/mytest/runtask.yaml. You can symlink any code between
> dirs, if your task is able to handle both. Or you can have separate code,
> if your task needs it (!). The formulas are duplicated, yes, but not
> really. First, they don't change much, so it's not really a problem, and
> second you might want to have a different behavior anyway (e.g. remove
> dist-git directive to download rpmlint.conf, since it's problematic for
> scratch builds).
>
> For non-distgit (generic) tasks, there would be two formulas under
> different names in the same repo (say runtask.yaml and
> runtask_scratch.yaml). We would configure trigger to run the task with
> 'runtask.yaml' for koji builds and 'runtask_scratch.yaml' for scratch
> builds (that should be hopefully trivial). And this is in line with our
> future ideas how scheduling should work - users should be able to configure
> when to run their task in a web UI, and a formula name would be part of
> that. That makes it flexible and would allow them to do awesome things,
> like experiment with a new formula while keeping the old formula still
> working. In short, I believe we want to go this route (multiple formula
> support) anyway. Having it currently "hard-coded" in trigger is not
> necessarily a bad thing, it's a step forward towards having it in a
> database.
>
> Ugh, such a long email. Bad kparal.
> _______________________________________________
> qa-devel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>

_______________________________________________
qa-devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Re: Enabling running Taskotron tasks on Koji scratch builds

Reply via email to