On 11/15/2013 12:21 AM, Ademar de Souza Reis Jr. wrote:
Hi there.

During the KVM Forum I discussed the future of autotest/virt-test
with the KVM folks there (Lucas, Cleber and Rudá) and we worked
on a list of ideas and features for the mid and long term.

This RFC summarizes our discussions and the features we have in
our backlog. Your feedback and requirements are very welcome.


Motivation / Goals
------------------

     The primary goals of the test automation team working on the
     KVM stack are:

     1. Offer high-quality tools and APIs to the QE teams and to
     the KVM-stack developers (KVM, QEMU, libvirt, etc.) who want
     to implement automated tests.

     2. Maintain a stable test grid running close to (and with
     participation from) KVM developers. With this test grid, we
     want to be able to report and pinpoint issues while features
     are still in development, thus increasing collaboration and
     test coverage.

     In order to accomplish these goals, we are planning some
     major changes to the way autotest and virt-test work. We
     believe these changes will take the autotest projects to
     "the next level" and foster more cooperation between KVM
     developers and QE. We also believe they will make the
     projects useful far beyond the scope of virtualization.

     Below you'll find the list of features we have in our
     backlog. Some of them are work-in-progress, but they're all
     open to feedback, requirements and ideas.


Always-ON test grid
-------------------

The test grid concept already exists in the current autotest
project. It is defined, more or less, as:

    - A set of machines to run tests (scalable), controlled by a
      server that:
         - provisions machines for testing (installation and
           hardware inventory)
         - records test results in a central database
         - provides a web and RPC interface for job submission and
           queries
         - handles notifications and reporting

We want to extend and refine the concept. The plan is to use
autotest for continuous testing, as part of a continuous
integration setup, with a minimum of false positives.

This is a macro-feature and some of the details are further down
this RFC, but the general idea is:

   - New builds or new packages installed in the test environment
     should trigger new test runs
       - run a pre-defined set of tests plus a random selection
         from the ones considered stable (to stay within time
         constraints)
   - Non-deterministic (flaky) tests may run as well, as a
     special job
       - e.g.: notify us only if this test breaks in more than 9
         of 10 consecutive runs (a sketch follows this list)
   - If a new package breaks a test, isolate it, bisect the
     changes, notify the responsible party, roll back to a
     previous version and keep testing the other components.
   - Environment details are saved so that developers can
     reproduce problems, and comparisons can be made to detect
     what has changed since the last time a test passed.
   - Test grid unavailable (halted) in case of infrastructure
     failure
       - Monitoring, dashboard, smart notification system
   - Dashboard with status reports, statistics, charts
     - Performance monitoring
     - What has been the failure rate in the past <month, week>?
     - What's the most fragile package?
     - Who is breaking tests most often?
     - ...
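
To make the flaky-test idea above more concrete, here is a minimal
Python sketch of the "notify only after more than 9 of 10 consecutive
failures" policy. None of this is existing autotest code; the class
and method names are purely illustrative:

    # Hypothetical sketch of the flaky-test notification policy above:
    # only raise a notification when a test has failed in more than 9
    # of its last 10 consecutive runs.
    from collections import deque

    class FlakyTestPolicy(object):

        def __init__(self, window=10, failure_threshold=9):
            self.window = window
            self.failure_threshold = failure_threshold
            self.history = deque(maxlen=window)

        def record(self, passed):
            """Record one run; return True when a notification should fire."""
            self.history.append(passed)
            failures = sum(1 for ok in self.history if not ok)
            return (len(self.history) == self.window and
                    failures > self.failure_threshold)

    policy = FlakyTestPolicy()
    for run in range(10):          # ten consecutive failures
        if policy.record(passed=False):
            print("notify: failed in more than 9 of 10 consecutive runs")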


Standalone test harness
-----------------------

The "run" script and the JeOS image, both part of virt-test, have
become extremely popular and proven to be useful in several virt
testing scenarios. It's the first step taken by many in the
adoption of the autotest framework and related projects when
testing the KVM stack and "good enough" for many testing
scenarios where a full test grid and control files may be
considered overkill.

We want to expand its usefulness by making it available outside
of the scope of virtualization, making it more flexible,
configurable and independent of virt-test.

We also want to provide a catalog of multiple JeOS images, or
even integrate it with virt-builder.

As a general guideline, tests that run on a test grid should also
be runnable in the standalone test harness, with minimal
bootstrap requirements.


Autotest as a service
---------------------

We want developers (e.g. KVM-stack developers) to be able to push
tests or new packages to a test grid, where tests can be run on
multiple environments and multiplexed (using different variants).

   - Developers should be able to push to a test grid:
       - Packages or git SHA1s (signed)
       - Tests from their own git repositories (signed)
   - They should be able to select hardware and OSes to run the
     tests on (or multiplex the test run on different
     environments)
   - Powerful RPC client (arc), new web UI, dashboard
   - Status can be queried via the command line (arc) and the web
   - Results sent via e-mail and available on the web
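
For illustration, a job submission through the RPC interface could
look roughly like the sketch below. The endpoint path, the create_job
method and its parameters are assumptions loosely based on the current
AFE JSON-RPC interface and may differ in the final service; the real
arc client would wrap calls like this one:

    import json
    import urllib2   # autotest is still Python 2 at this point

    # Assumed AFE JSON-RPC endpoint of the test-grid server (hypothetical host)
    RPC_URL = "http://autotest.example.com/afe/server/noauth/rpc/"

    def rpc_call(method, **params):
        # Minimal JSON-RPC style call; the exact wire format the grid
        # will expect is an assumption here.
        payload = json.dumps({"id": 0, "method": method, "params": [params]})
        request = urllib2.Request(RPC_URL, payload,
                                  {"Content-Type": "application/json"})
        return json.loads(urllib2.urlopen(request).read())

    # Push a job that runs a test on two guest variants.  All parameter
    # names below are illustrative, not a final interface.
    result = rpc_call("create_job",
                      name="virtio-win-live-migration",
                      control_file="job.run_test('migrate')",
                      hosts=["grid-node-01"],
                      meta_hosts=["rhel7-x86_64", "win7sp1-x86_64"])
    print(result)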


Separation of test execution from setup requirements
----------------------------------------------------

Autotest today doesn't understand the concept of test setup or
requirements.  As a consequence, we have large test jobs that end
up including the entire setup phase and its requirements as tests.
For example, we have tests which are responsible for things such as:

    - fetching git repositories
    - installing rpm packages from brew
    - building and installing stuff
    - installing guests

So if you need to test, let's say, live migration on Windows, you
end up running several tests (fetch from git/yum, build, install
Windows) before you get to what you want. If anything fails
during those first steps (e.g. the "build test"), you get a test
job failure, which is a false positive for you, given that you're
only interested in testing live migration.

We want to change that: tests should have a dependency chain.  A
failure during the setup phase should mean a failure in the
test grid, not a test failure. A failure in a dependency should
notify the person responsible for that particular feature, not
the final tester.

   - The test grid should have statuses, such as "online",
     "offline" and "halt due to infrastructure problem", with a
     queue of test jobs

   - Tests should be tagged with requirements, so that some of
     them can be declared incompatible with the test grid they're
     on (see below for more details on this specific item)

   - Build failures should be bisected, the person responsible
     for the error notified, the broken component reverted to a
     previous version (if possible) and testing resumed.
     Otherwise, the test grid should be halted and the admins
     notified

   - An infrastructure problem (e.g. a network error) should halt
     the test grid and notify the admins

   - Tests should have optional setup() and teardown() functions
     so that test writers can prepare the environment and clean
     it up after test execution (see the sketch after this list)

   - The test grid should have reference images to restore test
     environments to a pristine state if something goes wrong

This feature sounds good, but I have some comments about it:

- Sometimes the preparation tasks of one test loop are the actual
  test steps of another one, just like guest installation. So I
  want to confirm that we are really updating the test object in
  autotest, not only tagging some tests and marking them as
  allowed to fail.

- When we do this, we should also consider that some people only
  use the client tests on their local machines, not the whole of
  autotest. So when an infrastructure problem happens, we need to
  be careful about how we account for it; at the very least it
  should not influence other runs on that host.

- About handling the different kinds of failure: we should make
  sure we record the underlying reason for each failure. Right
  now failures in autotest are sometimes hard to analyze, because
  the code jumps around without enough information, and one error
  may be masked by another raised in the error-handling code.


Test Dependencies
-----------------

We want to allow test writers to specify what their tests require
and test-grid admins to declare what their test-grid environment
provides. This way it becomes trivial to list which tests are
compatible with a given environment, and we reduce the number of
spurious test failures.

The plan is to implement a tagging system, where test writers
introduce a tag and document what it means, while test-grid
admins declare the tag if their environment satisfies it. A
sketch of how the matching could work follows the examples below.

   - If the requirements are not present in the test grid, the
     test is not compatible with it and should not be run
   - We should define a common setup so that, by default, most
     current standard tests won't need any tag and should keep
     working as they are

   - Examples of hypothetical test requirements:

       - a new kernel feature:
           root # need to run as root
           kernel-3.3.10+ # linux kernel version

Actually, something like this is already implemented in the
package_check() function of qemu/control.kernel-version. With it
we can configure which packages we need on the host. The
performance of that function is bad right now, though; if you are
interested in this, I'd also like to spend some time improving it.


       - virtio-win bug fix that requires Win7SP1+:
           windows-7sp1+ # windows version
           win_fresh_install # new windows installation
           virtiowin-2013-10-24+ # virtio-win version

       - VGA pass-through, specific hardware attached:
           fresh_reboot # machine rebooted before the test
           DELL_R420 # machine specification (hardware)
           NVIDIAGTX999GX8GB_at_PCIePORT1
                 # a very specific card attached to a specific port
           ...
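
A possible way to model the tag matching is sketched below. The tag
names are taken from the examples above; the data structures and the
version comparison are assumptions for illustration only:

    # Illustrative sketch of matching test requirement tags against
    # the tags a test grid declares it provides.
    from distutils.version import LooseVersion

    def provides(grid_tags, requirement):
        """Return True if the grid satisfies a single requirement tag."""
        if requirement.endswith("+"):        # versioned, e.g. "kernel-3.3.10+"
            name, _, minimum = requirement[:-1].rpartition("-")
            for tag in grid_tags:
                tag_name, _, version = tag.rpartition("-")
                if (tag_name == name and
                        LooseVersion(version) >= LooseVersion(minimum)):
                    return True
            return False
        return requirement in grid_tags      # plain tag, e.g. "root"

    def compatible(grid_tags, test_requirements):
        return all(provides(grid_tags, req) for req in test_requirements)

    grid_tags = set(["root", "kernel-3.11.6", "DELL_R420"])
    test_tags = ["root", "kernel-3.3.10+"]
    print(compatible(grid_tags, test_tags))  # True: 3.11.6 satisfies 3.3.10+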

Drop-in tests
-------------

We want to reduce the barrier to writing a simple test to the
minimum possible. The plan is to provide support for
language-agnostic scripts that do something and return 0 on
success or an error code on failure.  No dependencies on autotest
at all for something trivial.

   - In virt mode, runs inside the guest, fully instrumented
     and multiplexed. E.g.: a one-liner "grep <flag> /proc/cpuinfo"
     (run on the guest) would be a valid test (see the standalone
     script after this list)
       - Multiplexed: runs on multiple hardware and guest
         configurations
       - Instrumented: logs, video record, historical results in the
         database
       - We may allow the combo of guest+host scripts on a single
         test (similar to the original proposal of qemu-tests)
   - Users can start growing the complexity of their tests:
       - using environment variables
       - python: import virttest # some goodies from autotest
       - python: def run(...) # full support in virt-test
       - python: create your own control file, etc.
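
As an example of how small a drop-in test can be, the one-liner
mentioned above could also be written as the following standalone
Python script (exit code 0 means pass); the CPU_FLAG environment
variable is a hypothetical way for the multiplexer to hand a
parameter to the script:

    #!/usr/bin/env python
    # Minimal drop-in test: pass iff a given CPU flag shows up in the
    # guest's /proc/cpuinfo.  No autotest imports at all; the harness
    # only looks at the exit code.
    import os
    import sys

    flag = os.environ.get("CPU_FLAG", "vmx")

    with open("/proc/cpuinfo") as cpuinfo:
        found = any(flag in line for line in cpuinfo
                    if line.startswith("flags"))

    sys.exit(0 if found else 1)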


Out-of-tree tests
-----------------

Users should be free to implement their tests outside of the
autotest repositories if they want (either in their own
repository, or in the same repository where the code being tested
resides).

This will require autotest API stability and versioning, so that
users can run their tests using a known stable version of the API
and tools.


Virt testing API
----------------

We have a lot of code in the virt-test APIs which is shared
between several virt-test tests. Given the number of new tests
and changes we have in virt-test, we suffer from API instability
and some code churn.

We want to make the virt testing API stable (or at least part of
it), with predictable behavior, documentation and proper
versioning.

----- x -----

As you can all see, this is an ambitious list that will require
many changes to the autotest framework and the virt-test codebase.

We've been working on some of these features already, but the
main, disruptive changes will take place over the coming months
and may take a year or more to be in good shape, ready to be
released.

To handle the transition with minimal disruption to current
autotest users and test writers, our plan is to:

   - Make a stable release of autotest and freeze both the API and
     the behavior of the tools, while we start working on the more
     disruptive changes in an unstable branch.

   - Start the separation of virt-test into 1) a stable API and
     2) upstream tests that make use of it (together with the
     parts of the API which may require constant changes). We
     expect to spend the next 2 or 3 months on this split, and we
     don't know where the line will be drawn yet. Once done, we'll
     release a stable virt-test API and start working on the more
     intrusive changes in a new branch.

This is the part I'm most concerned about. Will the stable API
also remain under development? I mean, once part of the API is
declared stable, will anyone still be allowed to change it while
the tests and the unstable API keep growing fast?

The problem is that sometimes it is hard to tell whether an API is
stable enough. We have hit all kinds of problems in our testing.
Take aexpect.py as an example: in the beginning we considered it a
very basic and stable module, but as we did more tests we found it
cannot deal with kernel messages popping up on the console. So if
we separate the repos and fix things in both of them before the
merge window, there will be conflicts. And as the virt-test tests
grow fast, every 3~6 months there could be a very big gap.

Do you guys still remember what happened in the middle of this
year, when we decided to push all the API bug fixes, development
and cases upstream? It took about one month for the API update and
conflict resolution, and during that time the autotest tree was
very unstable, because the code had not been well tested against
the new API; we even spent more time on bug fixing because of it.

Pros and cons of the split:

Pros:
   - We get a smaller set of APIs which is easier to learn and use.

Cons:
   - Users (test developers) may keep working against the old APIs
     and ignore what happens in the other repo, leading to
     duplicated work and conflicts.
   - New APIs changed in the API repo may not get well tested,
     because everyone keeps using the old APIs.
   - Poorly tested APIs merged into the tree will cause test-case
     failures, which means every test developer faces an unstable
     bug-fixing period after each API update, especially people
     who work with private test cases.
   - We need more people to handle the merges.

So, if we cannot separate it cleanly and without further
conflicts, is it really worth doing? Could we reach the same goal
in another way, for example with some kind of whitelist in
autotest or virt-test?

Maybe we can write a small script that generates an introduction
to the virt-test APIs. It would list the functions and their
descriptions based on the docstrings in the code, and we could
mark our APIs with tags like stable, unstable, nics, images and so
on. Then, when people first start with virt-test or autotest, they
can get the very basic, stable functions from that tool to help
them build their own test cases. Later, when they want to write
more complex cases such as nic or block tests, they can also list
the useful APIs provided by our framework, and the output shows
which of them are stable and which may still have problems. The
script could also be used during development: when someone touches
a stable function, check_patch.py could raise a warning telling
them that they are modifying a stable function and need to pay
extra attention.
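
Something along these lines could be enough for a first version of
that script; the "API tags:" docstring marker and the stable/unstable
labels are only a suggested convention, not an existing scheme:

    # Rough sketch of the API-catalog idea: walk a module, pull the
    # first docstring line of every public function, and group the
    # functions by an "API tags:" marker found in the docstring.
    import importlib
    import inspect
    import sys

    def catalog(module):
        entries = {}
        for name, func in inspect.getmembers(module, inspect.isfunction):
            if name.startswith("_"):
                continue
            doc = inspect.getdoc(func) or ""
            summary = doc.splitlines()[0] if doc else "(no docstring)"
            tags = ["untagged"]
            for line in doc.splitlines():
                if line.lower().startswith("api tags:"):
                    tags = [t.strip() for t in line.split(":", 1)[1].split(",")]
            for tag in tags:
                entries.setdefault(tag, []).append((name, summary))
        return entries

    if __name__ == "__main__":
        # e.g.: python api_catalog.py virttest.utils_net
        module = importlib.import_module(sys.argv[1])
        for tag, funcs in sorted(catalog(module).items()):
            print("[%s]" % tag)
            for name, summary in funcs:
                print("  %-25s %s" % (name, summary))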


   - Before we make the first release of the autotest framework
     with the new features and the new virt-test API, we'll spend
     the necessary amount of time making sure all upstream tests
     (virt-tests) are compatible with the new features and changes
     being introduced by the framework.

Comments, questions, concerns?

Thanks.
     - Ademar


