Re: New automated test coverage: openQA tests of critical path updates

2017-03-06 Thread Adam Williamson
On Mon, 2017-02-27 at 10:22 -0800, Adam Williamson wrote:
> Hi folks!
> 
> I am currently rolling out some changes to the Fedora openQA deployment
> which enable a new testing workflow. From now on, a subset of openQA
> tests should be run automatically on every critpath update, both on
> initial submission and on any edit of the update.

Hi again folks! Just a quick update on progress here so far.

The deployment went pretty well, and the tests have been running now
for the last week or so. You can view all the results so far here:

https://openqa.fedoraproject.org/group_overview/2?limit_builds=400

One thing you might notice right away is the list sort order. openQA
currently sorts 'builds' (in this context, the update is the 'build')
on the assumption that they sort as dotted version strings, which
Fedora update IDs (the string we use as the 'build' value for these
tests) certainly don't. I've got a PR in progress upstream to allow us
to sort these differently, and that should get changed soon.

About halfway through last week I implemented a change which means any
failed test is automatically retried; this cut down quite a lot on
false failures caused by transient bugs, mirror issues etc. There are
still occasional cases of this, though. For now you can force all the
tests to be re-run by editing the update in any way at all; in future
we'll probably try and set up some system which lets you request re-
runs of failed tests if the failures don't look like an actual bug in
the update.
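
As a rough illustration of the retry behaviour described above, here is a
minimal Python sketch. It is not the actual openQA change: run_with_retry,
run_test and the "passed"/"failed" strings are hypothetical names chosen
for the example, and the limit of two attempts is an assumption.

```python
from typing import Callable, Tuple


def run_with_retry(run_test: Callable[[], str], max_attempts: int = 2) -> Tuple[str, int]:
    """Run a test, re-running it on failure up to max_attempts times in total.

    Returns the final outcome and the number of attempts that were used.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    outcome = "failed"
    attempt = 0
    for attempt in range(1, max_attempts + 1):
        outcome = run_test()
        if outcome == "passed":
            # A pass on any attempt is accepted; earlier failures are
            # treated as transient (mirror issues and the like).
            break
    return outcome, attempt
```

A test that fails twice in a row is still reported as failed, so a real bug
in the update is not hidden by the retry.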

This week I'm aiming to get the necessary changes made so that Bodhi
will find and display these results alongside the Taskotron results in
its web UI, which should make them much more visible.

There's another significant factor which I hadn't considered: today was
the Bodhi activation point for Fedora 26, meaning we now have Fedora 26
critpath updates we could test.

For now I've decided to go ahead and try and test Branched updates, and
just see how much of a mess it turns out to be. I suspect, though, that
we'll have problems with the tests failing due to underlying bugs far
more often (certainly several of the tests currently fail on Branched,
for instance), and also we'll have problems with the base disk images
much more often for pre-release branches. It may prove to be difficult
or impossible to provide useful feedback for Branched updates with this
system, and if so, we'll turn it off.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
___
qa-devel mailing list -- qa-devel@lists.fedoraproject.org
To unsubscribe send an email to qa-devel-le...@lists.fedoraproject.org


Re: New automated test coverage: openQA tests of critical path updates

2017-03-03 Thread Kamil Paral
> On Thu, 2017-03-02 at 04:31 -0500, Kamil Paral wrote:
> > > The job can - and already does - log the exact packages it actually
> > > got, but I don't think there's an easy way for it to take the
> > > 'last_modified' date for the update at the time it does the download.
> > 
> > I don't know how you download the rpms, but a single python call can
> > do that (http get and parse the json). Again, to prevent race
> > conditions, it would be good to do the call before and after
> > downloading the rpms and compare the timestamp. These race conditions
> > occur surprisingly often once you start executing hundreds/thousands
> > tasks a day.
> > 
> > But if this is easier done in the scheduler, I think that's totally fine.
> 
> During test execution, we can only really type stuff into the console.
> We try to keep the amount of typing-into-consoles we do to a minimum,
> too, as the more there is, the more likely it is openQA will choke on a
> keypress and fail. (Though these tests already involve quite a lot of
> typing, can't avoid it.) The test just uses the Bodhi CLI client to
> download the packages.

Sure, I'm not saying this needs to happen during the actual test. That seems 
silly, if we can do the same thing in the scheduler (initial timestamp) and in 
the reporter (end timestamp).

> 
> I mean, it's not impossible, we *could* just type in a curl / Python
> one-liner (or use something like httpie to hit the API to get it). I'm
> just questioning whether it's worth the effort.

It's not necessary now, in the "development" phase. But once we want to gate on 
it, I think it's very important (if we want our gating mechanics to be 
reliable).


Re: New automated test coverage: openQA tests of critical path updates

2017-03-02 Thread Adam Williamson
On Thu, 2017-03-02 at 04:31 -0500, Kamil Paral wrote:
> > The job can - and already does - log the exact packages it actually
> > got, but I don't think there's an easy way for it to take the
> > 'last_modified' date for the update at the time it does the download.
> 
> I don't know how you download the rpms, but a single python call can
> do that (http get and parse the json). Again, to prevent race
> conditions, it would be good to do the call before and after
> downloading the rpms and compare the timestamp. These race conditions
> occur surprisingly often once you start executing hundreds/thousands
> tasks a day.
> 
> But if this is easier done in the scheduler, I think that's totally fine.

During test execution, we can only really type stuff into the console.
We try to keep the amount of typing-into-consoles we do to a minimum,
too, as the more there is, the more likely it is openQA will choke on a
keypress and fail. (Though these tests already involve quite a lot of
typing, can't avoid it.) The test just uses the Bodhi CLI client to
download the packages.

I mean, it's not impossible, we *could* just type in a curl / Python
one-liner (or use something like httpie to hit the API to get it). I'm
just questioning whether it's worth the effort.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net


Re: New automated test coverage: openQA tests of critical path updates

2017-03-02 Thread Kamil Paral
> > There's one important thing we need to do first, though. Bodhi ID
> > doesn't identify the thing tested uniquely, because Bodhi updates are
> > mutable (and the ID is kept). So Bodhi (or any gating tools) can't
> > rely on just retrieving the latest result for a particular Bodhi ID
> > and trust that result. It might be old and no longer reflect the
> > current state. We need to extend bodhi_update results with
> > "timestamp" key in extra data, that will report the "last_modified"
> > time of the Bodhi update tested. And Bodhi (or any other tool) must
> > not only query for item=$bodhi_id&type=bodhi_update, but also for
> > timestamp=$timestamp. Only with this we can be sure we've really
> > tested particular Bodhi update.
> 
> I'm not so sure it's really necessary, and doing it is actually tricky
> for openQA. Only the openQA job itself knows what packages it actually
> tested, and it doesn't have an easy way to get the associated
> timestamp. The scheduler could easily get the timestamp at the time the
> job was created, or at the time the job completed, but that will never
> be 100% reliable, because the job actually goes and does the download
> somewhere in between those two times.

This problem is not exclusive to openqa, it affects all tasks that test bodhi 
updates and download the included rpms (there's always a race condition 
window). For openqa, I see two options here:

a) record the timestamp in the scheduler when the job is created and use it. 
Either it will be correct, or if the race condition happens, it will publish a 
result based on testing newer packages with an older timestamp. That's slightly 
incorrect, but not really a problem. Because the update edit event scheduled 
another openqa run, and that will publish an up-to-date result. So there's no 
harm done.

b) record the timestamp in the scheduler when the job is created, and when the 
job is finished. If they don't match, ignore the result, don't publish it. The 
update edit event scheduled another openqa run anyway. Again, no harm done and 
we didn't populate resultsdb with an incorrect result. (This is similar to what 
we do in certain taskotron tasks - if we detect that a bodhi update state 
doesn't match at the time when we publish results, we print it into the logs 
and skip them.)

> 
> The job can - and already does - log the exact packages it actually
> got, but I don't think there's an easy way for it to take the
> 'last_modified' date for the update at the time it does the download.

I don't know how you download the rpms, but a single python call can do that 
(http get and parse the json). Again, to prevent race conditions, it would be 
good to do the call before and after downloading the rpms and compare the 
timestamp. These race conditions occur surprisingly often once you start 
executing hundreds/thousands of tasks a day.

But if this is easier done in the scheduler, I think that's totally fine.
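
For reference, the "http get and parse the json" call could look roughly
like the sketch below. The URL pattern, the Accept header and the JSON key
name are assumptions made for the example (this thread calls the value
"last_modified"; the key in the real response may be named differently), so
check them against the Bodhi instance in use.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen

# Assumptions for illustration only: URL pattern and JSON key name.
BODHI_UPDATE_URL = "https://bodhi.fedoraproject.org/updates/{update_id}"
TIMESTAMP_KEY = "date_modified"


def parse_last_modified(body: str, key: str = TIMESTAMP_KEY) -> Optional[str]:
    """Extract the modification timestamp from a JSON response body.

    Accepts both a bare update object and one wrapped in an "update" key.
    """
    data = json.loads(body)
    update = data.get("update", data)
    return update[key]


def fetch_last_modified(update_id: str, timeout: float = 30.0) -> Optional[str]:
    """HTTP GET the update as JSON and return its modification timestamp."""
    request = Request(
        BODHI_UPDATE_URL.format(update_id=update_id),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_last_modified(response.read().decode("utf-8"))
```

Calling fetch_last_modified() once before and once after the download, and
comparing the two values, gives the race detection described above.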

> 
> OTOH, I don't think it's really too bad just to show the 'most recent'
> results. That should usually only be out of date for a few minutes
> after an update is edited. It might be possible to do a 'tests
> running...' spinner when there are jobs scheduled or running for the
> update in question, even.

You're assuming here that the new task will finish successfully. Often it will 
not. In my experience, the network is the bane of automated testing. Bodhi will 
time out, koji will time out, they will return http 5xx errors, etc. Taskotron 
tasks are plagued with it (at least dozens of such failures a day). That's why I 
try to detect the race condition and either not record it at all, or record it 
with the older timestamp, which is safe - you don't mislead people/tools when 
looking at the results. The worst thing to happen here is that a result is 
missing for a long time. And people will then complain (and we start 
investigating) or they'll use the "request re-testing" button, which we'll have 
to provide sooner or later (because all systems are imperfect).

Of course I'm not saying we need to have this *now*. But I think it's necessary 
for gating updates.


Re: New automated test coverage: openQA tests of critical path updates

2017-03-02 Thread Jan Sedlak
2017-03-01 18:04 GMT+01:00 Adam Williamson:
> I'm not so sure it's really necessary, and doing it is actually tricky
> for openQA. Only the openQA job itself knows what packages it actually
> tested, and it doesn't have an easy way to get the associated
> timestamp. The scheduler could easily get the timestamp at the time the
> job was created, or at the time the job completed, but that will never
> be 100% reliable, because the job actually goes and does the download
> somewhere in between those two times.

I thought that Bodhi should be the one providing timestamps...


Re: New automated test coverage: openQA tests of critical path updates

2017-03-01 Thread Adam Williamson
On Wed, 2017-03-01 at 11:18 -0500, Kamil Paral wrote:
> So my first thought was to recommend you to also publish just
> type=koji_build results and finish this transition. But then I
> realized that's wrong. OpenQA operates completely different than the
> aforementioned tasks do. We operate on builds, and can distinguish
> which build is causing issues. Collating them into bodhi_update
> results is just a convenience measure for the consumer. But you
> operate on the whole update as a set. You can't distinguish which
> build of the update caused the issues, you just know that some of
> them did. So the smallest unit for you is bodhi update, and you
> should report results as such.

Yes, exactly.

> The way forward is, I believe, to extend Bodhi to query both
> type=koji_build for all included builds and collate the results (if
> needed), and also query type=bodhi_update and shows those results as
> well. Because different tasks operate on different type of data,
> which influences how they publish the results. And both use cases are
> valid.

Yep, again, this is what I was expecting to do.

> There's one important thing we need to do first, though. Bodhi ID
> doesn't identify the thing tested uniquely, because Bodhi updates are
> mutable (and the ID is kept). So Bodhi (or any gating tools) can't
> rely on just retrieving the latest result for a particular Bodhi ID
> and trust that result. It might be old and no longer reflect the
> current state. We need to extend bodhi_update results with
> "timestamp" key in extra data, that will report the "last_modified"
> time of the Bodhi update tested. And Bodhi (or any other tool) must
> not only query for item=$bodhi_id&type=bodhi_update, but also for
> timestamp=$timestamp. Only with this we can be sure we've really
> tested particular Bodhi update.

I'm not so sure it's really necessary, and doing it is actually tricky
for openQA. Only the openQA job itself knows what packages it actually
tested, and it doesn't have an easy way to get the associated
timestamp. The scheduler could easily get the timestamp at the time the
job was created, or at the time the job completed, but that will never
be 100% reliable, because the job actually goes and does the download
somewhere in between those two times.

The job can - and already does - log the exact packages it actually
got, but I don't think there's an easy way for it to take the
'last_modified' date for the update at the time it does the download.

OTOH, I don't think it's really too bad just to show the 'most recent'
results. That should usually only be out of date for a few minutes
after an update is edited. It might be possible to do a 'tests
running...' spinner when there are jobs scheduled or running for the
update in question, even.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net


Re: New automated test coverage: openQA tests of critical path updates

2017-03-01 Thread Kamil Paral
> Hi folks!
> 
> I am currently rolling out some changes to the Fedora openQA deployment
> which enable a new testing workflow. From now on, a subset of openQA
> tests should be run automatically on every critpath update, both on
> initial submission and on any edit of the update.
> 
> For the next little while, at least, this won't be incredibly visible.
> openQA sends out fedmsgs for all tests, so you can sign up for FMN
> notifications to learn about these results. They'll also be
> discoverable from the openQA web UI - https://openqa.fedoraproject.org
> . The results are also being forwarded to ResultsDB, so they'll be
> visible via ResultsDB API queries and the ResultsDB web UI. But for
> now, that's it...I think.
> 
> Our intent is to set up the necessary bits so that these results will
> show up in the Bodhi web UI alongside the results for relevant
> Taskotron tests. There's an outside possibility that Bodhi is actually
> already set up to find these results in ResultsDB, in which case
> they'll just suddenly start showing up in Bodhi - we should know about
> that soon enough. :) But most likely Bodhi will need a bit of a tweak
> to find them. 

Let me add a bit of a technical background here. Bodhi web UI now queries 
ResultsDB for all available testcases, and then asks for 
item=$item&type=koji_build for all these testcases. So the new results won't be 
visible (unless you start reporting them per build).
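
To show the shape of such a query, here is a small Python sketch that
builds the URL. The base URL and the /results path are assumptions for the
example and may not match the deployed ResultsDB; only the item and type
parameters come from the description above.

```python
from urllib.parse import urlencode

# Assumed base URL, for illustration only.
RESULTSDB_API = "https://taskotron.fedoraproject.org/resultsdb_api/api/v2.0"


def results_query_url(item: str, item_type: str, base: str = RESULTSDB_API) -> str:
    """Build a results query filtered by item and type."""
    params = {"item": item, "type": item_type}
    return f"{base}/results?{urlencode(params)}"
```

With type fixed to koji_build, as Bodhi does today, results reported with
type=bodhi_update never match, which is why the openQA results stay
invisible.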

Our depcheck and upgradepath tasks report both type=koji_build (for each SRPM) 
and type=bodhi_update (for each bodhi update). Both these tools internally 
process RPMs or builds, and then query Bodhi at the end and collate the results 
to bodhi_update results. We want to get rid of that collating, because it has 
numerous issues:
1. It's slow, because Bodhi is very slow to respond
2. It often causes the task to fail, because Bodhi often returns 500 errors
3. It's prone to race conditions. It happens often that a Bodhi update is 
edited between the task start and end, changing included builds.
4. It's unnecessary, because Bodhi knows all this information, and can collate 
the data itself, without any network issues or race conditions.
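
For readers unfamiliar with the term, "collating" here boils down to
something like the following Python sketch, wherever it ends up being done.
The outcome names and their severity order are assumptions for the example,
not the actual depcheck/upgradepath logic.

```python
from typing import Dict

# Assumed outcome names, ordered from least to most severe.
SEVERITY = ["PASSED", "INFO", "NEEDS_INSPECTION", "FAILED"]


def collate(build_outcomes: Dict[str, str]) -> str:
    """Reduce per-build outcomes to one outcome for the whole update.

    The update gets the most severe outcome found among its builds.
    """
    if not build_outcomes:
        raise ValueError("an update must contain at least one build")
    return max(build_outcomes.values(), key=SEVERITY.index)
```

The hard part is not this reduction but knowing which builds belong to the
update at that moment, which is exactly the information Bodhi already has.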

So we still publish type=bodhi_update for compatibility reasons (I'm not sure 
whether some part of the Bodhi backend might still use this data), but we want 
to get rid of it.

So my first thought was to recommend you to also publish just type=koji_build 
results and finish this transition. But then I realized that's wrong. OpenQA 
operates completely differently than the aforementioned tasks do. We operate on 
builds, and can distinguish which build is causing issues. Collating them into 
bodhi_update results is just a convenience measure for the consumer. But you 
operate on the whole update as a set. You can't distinguish which build of the 
update caused the issues, you just know that some of them did. So the smallest 
unit for you is bodhi update, and you should report results as such.

The way forward is, I believe, to extend Bodhi to query both type=koji_build 
for all included builds and collate the results (if needed), and also query 
type=bodhi_update and show those results as well. Because different tasks 
operate on different types of data, which influences how they publish the 
results. And both use cases are valid.

There's one important thing we need to do first, though. The Bodhi ID doesn't 
identify the thing tested uniquely, because Bodhi updates are mutable (and the 
ID is kept). So Bodhi (or any gating tool) can't rely on just retrieving the 
latest result for a particular Bodhi ID and trusting that result. It might be 
old and no longer reflect the current state. We need to extend bodhi_update 
results with a "timestamp" key in extra data, which will report the 
"last_modified" time of the Bodhi update tested. And Bodhi (or any other tool) 
must not only query for item=$bodhi_id&type=bodhi_update, but also for 
timestamp=$timestamp. Only then can we be sure we've really tested a particular 
Bodhi update.
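
On the consumer side, the check could be as simple as the Python sketch
below. It assumes results arrive as flat dicts carrying "item", "type" and
the proposed "timestamp" key; that layout is an illustration, not the real
ResultsDB response format.

```python
from typing import Dict, List


def results_for_current_state(
    results: List[Dict[str, str]], update_id: str, last_modified: str
) -> List[Dict[str, str]]:
    """Keep only bodhi_update results that were produced against the
    current state of the update, identified by its last_modified time."""
    return [
        r
        for r in results
        if r.get("item") == update_id
        and r.get("type") == "bodhi_update"
        and r.get("timestamp") == last_modified
    ]
```

A result recorded before the latest edit carries an older timestamp and is
dropped, so a stale pass can never be mistaken for a pass of the current
update. An empty list simply means "not tested yet".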