Re: [gentoo-dev] Automatic testing on Gentoo

2011-05-11 Thread Alec Warner
On Wed, May 11, 2011 at 6:12 AM, Jack Morgan  wrote:
>
>
> On 05/10/2011 01:13 PM, Jorge Manuel B. S. Vicetto wrote:
>> Hi.
>>
>> Another issue that was raised in the discussion with the arch teams,
>> even though it predates the arch teams resources thread as we've talked
>> about it on FOSDEM 2011 and even before, is getting more automatic
>> testing done on Gentoo.
>>
>> I'm bcc'ing a few teams on this thread as it involves them and hopefully
>> might interest them as well.
>>
>> Both Release Engineering and QA teams would like to have more automatic
>> testing to find breakages and to help track "when" things break and more
>> importantly *why* they break.
>>
>> To avoid misunderstandings, we already have testing and even automated
>> testing being done on Gentoo. The "first line" of testing is done by
>> developers using repoman and/or the PM's QA tools. We also have
>> individual developers and the QA team hopefully checking commits and
>> everyone testing their packages.
>>
>> Furthermore, the current weekly automatic stage building has helped
>> identify some issues with the tree. The tinderbox work done by Patrick
>> and Diego, as well as others, has also helped find broken packages
>> and identify packages affected by major changes before they hit
>> the tree. The use of repoman, pcheck and/or paludis quality assurance
>> tools in the past and present to generate reports about tree issues,
>> like Michael's (mr_bones) emails, has also helped identify and
>> address issues.
>>
>> Recently, we got a new site, http://qa-reports.gentoo.org/, to check
>> the results of some tests, with the possibility of adding more scripts
>> to provide / run even more tests.
>>
>> So, why "more testing"? For starters, more *automatic* testing. Then
>> more testing as reports from testing can help greatly in identifying
>> when things break and why they break. As someone that looks over the
>> automatic stage building for amd64 and x86, and that has to talk to
>> teams / developers when things break, having more, more in-depth and
>> regular automatic testing would help my (releng) job. The work for the
>> live-dvd would also be easier if the builds were "automated" and the job
>> wasn't "restarted" every N months. Furthermore, creating a framework for
>> developers to be able to schedule testing for proposed changes, in
>> particular for substantial changes, might (should?) help improve the
>> user's experience.
>>
>> I hope you agree with "more testing" by now, but what testing? It's good
>> to test something, but "what" do we want to test and "how" do we want to
>> test?
>>
>>
>> I think we should try to have at least the following categories of tests:
>>
>>  * Portage (overlays?) QA tests
>>       tests with the existing QA tools to check the consistency of
>> dependencies and the quality of ebuilds / eclasses.

These are almost separate. I assume your intent was 'let's automate
pcheck & co. runs of gentoo-x86, and if we get that working we can add
overlays from layman', which sounds fine to me ;)
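For what it's worth, that first step doesn't need much machinery. A minimal
sketch of such a wrapper, in Python, could look like the following; the
checkout path, the report directory and the idea of running "repoman full"
per package directory are assumptions for illustration, not a description of
any existing service.

#!/usr/bin/env python
"""Hypothetical wrapper that runs `repoman full` across a gentoo-x86
checkout (and, later, layman overlays) and keeps per-package reports.
All paths are assumptions, not an existing setup."""

import os
import subprocess

REPOS = [
    "/usr/portage",            # assumed location of the gentoo-x86 checkout
    # "/var/lib/layman/kde",   # overlays could be appended here later
]
REPORT_DIR = "/var/tmp/qa-reports"  # hypothetical output location


def run_repoman(pkg_dir):
    """Run `repoman full` inside one package directory, return (rc, output)."""
    proc = subprocess.Popen(
        ["repoman", "full"],
        cwd=pkg_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = proc.communicate()
    return proc.returncode, out


def main():
    os.makedirs(REPORT_DIR, exist_ok=True)  # exist_ok needs Python >= 3.2
    for repo in REPOS:
        for category in sorted(os.listdir(repo)):
            cat_dir = os.path.join(repo, category)
            if not os.path.isdir(cat_dir) or "-" not in category:
                continue  # crude filter for category dirs (e.g. app-editors)
            for pkg in sorted(os.listdir(cat_dir)):
                pkg_dir = os.path.join(cat_dir, pkg)
                if not os.path.isdir(pkg_dir):
                    continue
                rc, out = run_repoman(pkg_dir)
                # assume a non-zero exit status means something worth a report
                if rc != 0:
                    report = os.path.join(REPORT_DIR,
                                          "%s:%s.txt" % (category, pkg))
                    with open(report, "wb") as f:
                        f.write(out)


if __name__ == "__main__":
    main()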

>>
>>  * (on demand?) package (stable / unstable) revdep rebuild (tinderbox)
>>       framework to schedule testing of proposed changes and check their 
>> impact

I'd be curious what the load is here. We are adopting an on-demand
testing infrastructure at work.  Right now we have a continuous build
but it is time-delta based and not event-based so it groups changes
together which makes it hard to find what broke things. At work we
only submit a few changes a day though, so we need a very small
infrastructure to test each change. Gentoo has way more commits (at
least one every few minutes on average, and then there are huge
commits like KDE stabilization...)

What I'd recommend here is essentially some kind of control field in
the commit itself (commitmsg?) that controls exactly what tests are
done for that commit.
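To make that concrete, a rough sketch of parsing such a field could look like
the code below; the "Test-Request:" trailer name, its vocabulary and the
scheduler hooks are all invented for illustration and don't exist anywhere
today.

import re

# Hypothetical commit-message trailer, e.g. a message ending with:
#
#   Test-Request: revdep dev-libs/openssl
#   Test-Request: stage4 kde
#
# The field name and its vocabulary are assumptions only; nothing like
# this is parsed from gentoo-x86 commits today.
TRAILER = re.compile(r"^Test-Request:\s*(?P<kind>\S+)\s*(?P<args>.*)$",
                     re.MULTILINE)


def requested_tests(commit_message):
    """Return (kind, args) tuples for every trailer found in the message."""
    return [(m.group("kind"), m.group("args").split())
            for m in TRAILER.finditer(commit_message)]


def schedule_revdep_rebuild(atoms):
    # placeholder hook: hand the atoms to whatever tinderbox queue exists
    print("queue revdep rebuild for: %s" % " ".join(atoms))


def schedule_stage4_build(targets):
    # placeholder hook: kick off tailored stage4 builds ("kde", "gnome", ...)
    print("queue stage4 build(s): %s" % " ".join(targets))


def dispatch(commit_message):
    for kind, args in requested_tests(commit_message):
        if kind == "revdep":
            schedule_revdep_rebuild(args)
        elif kind == "stage4":
            schedule_stage4_build(args)
        # unknown kinds are silently ignored so old tooling keeps working


# Example:
# dispatch("Bump to 1.0.1.\n\nTest-Request: revdep dev-libs/openssl\n")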

>>
>>  * Weekly (?) stable / unstable stage / ISO arch builds
>>       the automatic stage building, including new specs for the testing tree
>> as we currently only test the stable tree.

I'm curious: if you constantly build unstable, do you plan on fixing
it? My understanding of Gentoo is that in ~arch something is always
slightly broken and that's OK. I worry that ~arch builds may just end
up being noise because they don't build properly due to the high
velocity of changes.

>>
>>  * (schedule?) specific tailored stage4 builds
>>       testing of specific tailored "real world" images (web server, intranet
>> server, generic desktop, GNOME desktop, KDE desktop, etc).

Again, it would be interesting to have some kind of control field in my
commits so that when KDE goes stable I could trigger a build of the 'KDE
stage4' or whatnot.
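As a sketch of what that trigger could do on the releng side, the snippet
below maps a hypothetical stage4 target name to a catalyst spec file and
starts the build; the spec paths and the mapping itself are assumptions.

import subprocess

# Hypothetical mapping from a stage4 target named in a commit's control
# field to a catalyst spec file; the paths and names are assumptions.
STAGE4_SPECS = {
    "kde":    "/etc/catalyst/specs/stage4-kde.spec",
    "gnome":  "/etc/catalyst/specs/stage4-gnome.spec",
    "server": "/etc/catalyst/specs/stage4-webserver.spec",
}


def trigger_stage4(target):
    """Kick off a catalyst run for one tailored stage4 target."""
    spec = STAGE4_SPECS.get(target)
    if spec is None:
        raise ValueError("unknown stage4 target: %r" % target)
    # `catalyst -f <spec>` builds from a spec file; queueing, failure mail
    # and report upload are left out of this sketch.
    subprocess.check_call(["catalyst", "-f", spec])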

If we ever finish this gentoo-stats project, it would be interesting to
see what users are actually using as well. Do users use the defaults?
Are the stage4s we are testing actually relevant?

>>
>>  * Bi-Weekly (?) stable / unstable AMD64/X86 LiveDVD builds
>>       automatic creation of the live-DVD to test a very broad set of packages

Re: [gentoo-dev] Automatic testing on Gentoo

2011-05-11 Thread Jack Morgan


On 05/10/2011 01:13 PM, Jorge Manuel B. S. Vicetto wrote:
> Hi.
> 
> Another issue that was raised in the discussion with the arch teams,
> even though it predates the arch teams resources thread as we've talked
> about it on FOSDEM 2011 and even before, is getting more automatic
> testing done on Gentoo.
> 
> I'm bcc'ing a few teams on this thread as it involves them and hopefully
> might interest them as well.
> 
> Both Release Engineering and QA teams would like to have more automatic
> testing to find breakages and to help track "when" things break and more
> importantly *why* they break.
> 
> To avoid misunderstandings, we already have testing and even automated
> testing being done on Gentoo. The "first line" of testing is done by
> developers using repoman and/or the PM's QA tools. We also have
> individual developers and the QA team hopefully checking commits and
> everyone testing their packages.
> 
> Furthermore, the current weekly automatic stage building has helped
> identify some issues with the tree. The tinderbox work done by Patrick
> and Diego, as well as others, has also helped find broken packages
> and identify packages affected by major changes before they hit
> the tree. The use of repoman, pcheck and/or paludis quality assurance
> tools in the past and present to generate reports about tree issues,
> like Michael's (mr_bones) emails, has also helped identify and
> address issues.
> 
> Recently, we got a new site, http://qa-reports.gentoo.org/, to check
> the results of some tests, with the possibility of adding more scripts
> to provide / run even more tests.
> 
> So, why "more testing"? For starters, more *automatic* testing. Then
> more testing as reports from testing can help greatly in identifying
> when things break and why they break. As someone that looks over the
> automatic stage building for amd64 and x86, and that has to talk to
> teams / developers when things break, having more, more in-depth and
> regular automatic testing would help my (releng) job. The work for the
> live-dvd would also be easier if the builds were "automated" and the job
> wasn't "restarted" every N months. Furthermore, creating a framework for
> developers to be able to schedule testing for proposed changes, in
> particular for substantial changes, might (should?) help improve the
> user's experience.
> 
> I hope you agree with "more testing" by now, but what testing? It's good
> to test something, but "what" do we want to test and "how" do we want to
> test?
> 
> 
> I think we should try to have at least the following categories of tests:
> 
>  * Portage (overlays?) QA tests
>   tests with the existing QA tools to check the consistency of
> dependencies and the quality of ebuilds / eclasses.
> 
>  * (on demand?) package (stable / unstable) revdep rebuild (tinderbox)
>   framework to schedule testing of proposed changes and check their impact
> 
>  * Weekly (?) stable / unstable stage / ISO arch builds
>   the automatic stage building, including new specs for the testing tree
> as we currently only test the stable tree.
> 
>  * (schedule?) specific tailored stage4 builds
>   testing of specific tailored "real world" images (web server, intranet
> server, generic desktop, GNOME desktop, KDE desktop, etc).
> 
>  * Bi-Weekly (?) stable / unstable AMD64/X86 LiveDVD builds
>   automatic creation of the live-DVD to test a very broad set of packages
> 
>  * automated testing of built stage / CD / LiveDVD (KVM guest?) (CLI /
> GUI / log parsing ?)
>   framework to test the built stages / install media and ensure it works
> as expected
> 
> 
> I don't have a framework for conducting some of these tests, including
> the stage/iso validation, but some of them can use the existing tools
> like the stage building and the tree QA tests.
> 
> Do you have any suggestions about the automatic testing? Do you know of
> other tests or tools that we can and should use to improve QA on Gentoo?

You might take a look at autotest from kernel.org. It's a Python-based
framework for automating testing. It's geared toward kernel testing,
but could be adapted for your needs.
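For anyone who hasn't looked at it, autotest drives its client tests from
small Python "control" files. The fragment below is only a from-memory
illustration of that convention, so treat the variable names and bundled
test names as assumptions and check the upstream documentation.

# control -- a minimal autotest client control file (written from memory
# of autotest's conventions; verify the details upstream).  Control files
# are plain Python executed by the autotest client with a `job` object in
# scope; job.run_test() runs one of the tests bundled under client/tests/.
AUTHOR = "releng"
NAME = "gentoo stage sanity"
TIME = "SHORT"
TEST_CLASS = "General"
TEST_TYPE = "client"
DOC = """Run a couple of bundled client tests inside a freshly built image."""

job.run_test('sleeptest', seconds=1)   # trivial smoke test shipped with autotest
job.run_test('dbench')                 # example I/O benchmark bundled with autotest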




-- 
Jack Morgan
Pub 4096R/761D8E0A 2010-09-13 Jack Morgan 
Fingerprint = DD42 EA48 D701 D520 C2CD 55BE BF53 C69B 761D 8E0A





Re: [gentoo-dev] Automatic testing on Gentoo

2011-05-10 Thread Chris Richards

On 05/10/2011 03:13 PM, Jorge Manuel B. S. Vicetto wrote:

> So, why "more testing"? For starters, more *automatic* testing. Then
> more testing as reports from testing can help greatly in identifying
> when things break and why they break. As someone that looks over the
> automatic stage building for amd64 and x86, and that has to talk to
> teams / developers when things break, having more, more in-depth and
> regular automatic testing would help my (releng) job.

While I agree wholeheartedly with the sentiment being expressed here, I
just want to point out and remind everyone that automated testing is no
substitute for real live people using (and breaking) things.  People are
remarkably inventive and creative when it comes to finding ways to break
things that the developers never even considered.


All I'm trying to say is that I've seen (and worked on) far too many
teams in the past that fell into the trap of thinking automated testing
was sufficient. It isn't, but it certainly goes a long way towards
making manual testers' lives better, by letting them focus on finding
those problems that aren't (for one reason or another) reproducible in
an automated testing scenario.


Later,
Chris



[gentoo-dev] Automatic testing on Gentoo

2011-05-10 Thread Jorge Manuel B. S. Vicetto

Hi.

Another issue that was raised in the discussion with the arch teams,
even though it predates the arch teams resources thread as we've talked
about it on FOSDEM 2011 and even before, is getting more automatic
testing done on Gentoo.

I'm bcc'ing a few teams on this thread as it involves them and hopefully
might interest them as well.

Both Release Engineering and QA teams would like to have more automatic
testing to find breakages and to help track "when" things break and more
importantly *why* they break.

To avoid misunderstandings, we already have testing and even automated
testing being done on Gentoo. The "first line" of testing is done by
developers using repoman and/or the PM's QA tools. We also have
individual developers and the QA team hopefully checking commits and
everyone testing their packages.

Furthermore, the current weekly automatic stage building has helped
identify some issues with the tree. The tinderbox work done by Patrick
and Diego, as well as others, has also helped find broken packages
and identify packages affected by major changes before they hit
the tree. The use of repoman, pcheck and/or paludis quality assurance
tools in the past and present to generate reports about tree issues,
like Michael's (mr_bones) emails, has also helped identify and
address issues.

Recently, we got a new site, http://qa-reports.gentoo.org/, to check
the results of some tests, with the possibility of adding more scripts
to provide / run even more tests.

So, why "more testing"? For starters, more *automatic* testing. Then
more testing as reports from testing can help greatly in identifying
when things break and why they break. As someone that looks over the
automatic stage building for amd64 and x86, and that has to talk to
teams / developers when things break, having more, more in-depth and
regular automatic testing would help my (releng) job. The work for the
live-dvd would also be easier if the builds were "automated" and the job
wasn't "restarted" every N months. Furthermore, creating a framework for
developers to be able to schedule testing for proposed changes, in
particular for substantial changes, might (should?) help improve the
user's experience.

I hope you agree with "more testing" by now, but what testing? It's good
to test something, but "what" do we want to test and "how" do we want to
test?


I think we should try to have at least the following categories of tests:

 * Portage (overlays?) QA tests
tests with the existing QA tools to check the consistency of
dependencies and the quality of ebuilds / eclasses.

 * (on demand?) package (stable / unstable) revdep rebuild (tinderbox)
framework to schedule testing of proposed changes and check their impact

 * Weekly (?) stable / unstable stage / ISO arch builds
the automatic stage building, including new specs for the testing tree
as we currently only test the stable tree.

 * (schedule?) specific tailored stage4 builds
testing of specific tailored "real world" images (web server, intranet
server, generic desktop, GNOME desktop, KDE desktop, etc).

 * Bi-Weekly (?) stable / unstable AMD64/X86 LiveDVD builds
automatic creation of the live-DVD to test a very broad set of packages

 * automated testing of built stage / CD / LiveDVD (KVM guest?) (CLI /
GUI / log parsing ?)
framework to test the built stages / install media and ensure it works
as expected


I don't have a framework for conducting some of these tests, including
the stage/iso validation, but some of them can use the existing tools
like the stage building and the tree QA tests.
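As a very rough illustration of what the stage/ISO validation could look
like, the sketch below boots a built image in a KVM/QEMU guest and watches
the console output for a login prompt. The image name, the prompt string and
the assumption that the image writes its console to the emulated serial port
are placeholders, not current releng practice.

#!/usr/bin/env python
"""Rough sketch of a "does the built image boot?" check: start the ISO in
a KVM/QEMU guest, watch the console output for a login prompt, fail on
timeout.  Everything here is illustrative only."""

import subprocess
import time

ISO = "livedvd-amd64.iso"   # hypothetical output of the LiveDVD build
PROMPT = b"login:"          # what we treat as "booted far enough"
TIMEOUT = 600               # seconds


def boot_check(iso=ISO, timeout=TIMEOUT):
    # -nographic sends the serial console to stdout so we can parse it;
    # the guest kernel needs console=ttyS0 (or similar) for that to work.
    proc = subprocess.Popen(
        ["qemu-system-x86_64", "-enable-kvm", "-m", "1024",
         "-cdrom", iso, "-boot", "d", "-nographic"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    deadline = time.time() + timeout
    seen = b""
    try:
        while time.time() < deadline:
            chunk = proc.stdout.read(1)   # a real harness would use select()
            if not chunk:                 # guest exited / closed the console
                break
            seen += chunk
            if PROMPT in seen:
                return True
        return False
    finally:
        proc.kill()


if __name__ == "__main__":
    raise SystemExit(0 if boot_check() else 1)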

Do you have any suggestions about the automatic testing? Do you know of
other tests or tools that we can and should use to improve QA on Gentoo?

-- 
Regards,

Jorge Vicetto (jmbsvicetto) - jmbsvicetto at gentoo dot org
Gentoo- forums / Userrel / Devrel / KDE / Elections / RelEng