On 19.10.2012 01:36, Michael Hudson-Doyle wrote:
Incidentally that's something we may collaborate on.
Yeah, so how does checkbox deal with this? I guess it doesn't quite
have the concept of remote users submitting requests that jobs be run?
(i.e. checkbox is more dispatcher than scheduler in lava terminology).
We have largely the same problem but in a different context (there are
different internal users).
Checkbox has the concept of "whitelists" which basically specify the
test scenario. Each item in the whitelist is a "job" (full test
definition) that can use various checkbox "plugins" (like shell, manual
and many others that I'm not familiar with). Checkbox then transforms
the whitelist (resolving dependencies and things like that) and executes
the tests much like the dispatcher would.
I see.
There are several use cases that are currently broken
Such as?
From what I recall, mostly in the way upstream/downstream (and sometimes
side-stream) relationships work. The actual details are specific to
Canonical (I would gladly explain that in a private channel if you wish
to know more) but the general idea is that without some API stability
(and we offer none today) and script stability (you can think of it as
another level of API), our downstream users (who are NOT just
consumers) have a hard time following our releases.
The second issue, which is more directly addressed here, is that tests
flow poorly from team to team, and to get "stability" people prefer to
keep similar/identical tests to themselves (not as in secret, but as in
not easily collaborated upon).
One of the proposals would be to build a pypi-like directory of tests
and use that as a base for namespacing (first-come first-served name
allocation). I'm not entirely sure this would help to solve the problem
but it's something that, if available, could give us another vector.
Hm. This is definitely an interesting idea. I had actually already
thought that using user-specified distutils- or debian-style versioning
would make sense -- you would get the latest version by the chosen
algorithm by default, but could still upload revisions of old versions
if you wanted to.
I'd rather avoid debian-style versions in favor of a strict,
constant-length version scheme. Let's not have a custom PostgreSQL
function for comparing versions again ;)
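Something like this quick Python sketch is what I have in mind (the
zero-padded MAJOR.MINOR.PATCH format is just a strawman, not a settled
design): render every version at a constant length, and plain string
ordering -- and thus any ordinary database index -- sorts correctly:

import re

_VERSION_RE = re.compile(r"^(\d{1,4})\.(\d{1,4})\.(\d{1,4})$")

def canonical(version):
    """Render '1.10.0' as '0001.0010.0000' so string order == version order."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("not a strict MAJOR.MINOR.PATCH version: %r" % version)
    return ".".join(part.zfill(4) for part in match.groups())

assert canonical("1.10.0") > canonical("1.9.2")  # no custom comparator needed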
Part of this would be a command line tool for fetching / publishing test
definitions I guess. In fact this could almost be the main thing: it
depends whether you want to produce (and host, I guess) a single site
which is the centrepoint of the test definition world (like
pypi.python.org is for Python stuff) or just the tools / protocols
people use to run and work with their own repositories
(testdef.validation.linaro.org or testdef.qa.ubuntu.com or whatever).
I think that there _should_ be a central repository simply because it
means fewer fractures early on. From what I know, people don't deploy
their own pypi just to host their pet project. They only do that if they
depend on the protocols and tools around pypi and want to keep the code
private.
I think that, as with pypi, even if there is a "single centrepoint of
the test definition world", we should expect that sites will have local
test repositories for one reason and another (as they do with pypi).
Having said what I did above, nothing can prevent others from
re-implementing the same protocols or deploying their own archive but I
think we should encourage working in the common pool as this will
improve the ecosystem IMHO (look at easy_install, pip or even crate.io;
they would not have happened if there had been a group of competing
pypi-like systems with none dominant over the others). In other words,
the value of
pypi is the data that is stored there.
Another way to handle namespacing is to include the name of the user /
group that can update a resource in its name, a la branches on LP or
repos on github (or bundle streams in LAVA). Not sure if that's a good
idea for our use case or not.
I thought about one thing that would warrant the ~user/project approach.
Both pypi and launchpad are product-centric -- you shop for solutions
by product name. GitHub, on the other hand, is developer-centric, as
$product can have any number of forks that are equally exposed.
I think for our goals we should focus on product-centric views. The
actual code, wherever it exists, should be managed with other tools. I
would not like this concept to grow into a DVCS or a code hosting tool.
I wonder if checkbox's rfc822ish format would be better than JSON for
test interchange...
Probably, although it's still imperfect and handles binary data poorly.
What I'd like to see in practice is a free-for-all web service that
can hold test metadata. I believe that as we go, test metadata will
formalize, and at some point it may become possible to run a lava-test
test from checkbox and a checkbox job in lava (given appropriate
adapters on both sides) merely by specifying the name of the test.
So that's an argument for aiming for a single site? Maybe. Maybe you'd
just give a URL of a testdef rather than the name of a test, so
http://testdef.validation.linaro.org/stream rather than just 'stream'.
Imagine pip installing that each time. IMO it's better to stick to
names rather than URLs, if we can. People know how to manage names
already; URLs are something we can only google for.
The full URL could be usable for some kind of "packages" but that's not
the primary scope of the proposal, I think. Packages are more
complicated and secondary; the directory should merely point you at
something that you can install from an absolute URL.
Initially it could be a simple RESTful interface based on a dumb HTTP
server serving files from a tree structure.
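As a very rough sketch (the tree layout and the names below are made
up, not a settled design), the whole service could start as one rfc822
file per published version, served read-only over HTTP:

# Assumed layout, one immutable file per published version:
#
#   testdefs/<name>/<version>
#
# Any dumb HTTP server can serve it, e.g.:
#
#   $ cd testdefs && python -m http.server 8000
#
# and a client fetches a definition with a plain GET:
import urllib.request

def fetch_testdef(base_url, name, version):
    """Fetch one published test definition as rfc822 text."""
    with urllib.request.urlopen("%s/%s/%s" % (base_url, name, version)) as resp:
        return resp.read().decode("utf-8")

# e.g. fetch_testdef("http://localhost:8000", "lava-stream", "1.0b3")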
And then it could grow wiki-like features? :-)
I'd rather not go there. IMHO it should only have search and CRUD
actions on the content. Anything beyond that works better elsewhere
(readthedocs / crate.io). Remember that it's not the 'appstore'
experience that we are after here. The goal is to introduce a common
component that people can converge and thrive on. This alone may give us
better code re-usability, as we gain partial visibility into what other
developers do, _and_ we fix the release process for test definitions so that
people can depend on them indefinitely.
One of the user stories we have is "which tests are available to run on
board X with Y deployed to it?" -- if we use test repositories that are
entirely disconnected from the LAVA database I think this becomes a bit
harder to answer. Although one could make searching a required feature
of a test repository...
I think that's something to do in stage 2 as we get a better
understanding of what we have. In the end the perfect solution, for
LAVA, might be LAVA-specific and we should not sacrifice the generic
useful aspects in the quest for something this narrow.
Some simple classifiers that might help there:
Environment::Hardware::SoC::OMAP35xx
Environment::Hardware::Board::Panda Board ES
Environment::Hardware::Add-Ons::Linaro::ABCDXYZ-Power-Probe
Environment::Software::Linaro::Ubuntu Desktop
Environment::Software::Ubuntu::Ubuntu Desktop
But this requires building a sensible taxonomy which is something I
don't want to require in the first stage. The important part is to be
_able_ to build one as the meta-data format won't constrain you. As we
go we can release "official" meta-data spec releases that standardize
what certain things mean. This could then be used as a basis for
reliable (as in no false positives) and advanced search tools.
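For example, "no false positives" could simply mean matching whole
classifier segments, never substrings; a hypothetical sketch:

def matches(classifiers, prefix):
    """True if any classifier starts with the given path, segment-wise."""
    return any(c == prefix or c.startswith(prefix + "::") for c in classifiers)

classifiers = [
    "Environment::Hardware::Board::Panda Board ES",
    "Environment::Software::Ubuntu::Ubuntu Desktop",
]
assert matches(classifiers, "Environment::Hardware")
assert not matches(classifiers, "Environment::Hard")  # substring != segment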
This would allow us to try moving some of the experimental meta-data
there and build the client parts. If the idea gains traction it could
grow from there.
Some considerations:
1) Some tests have to be private. I don't know how to solve that in
namespaces. One idea that comes to mind is a .private. namespace
that is explicitly non-global and can be provided by a local "test
definition repository"
That would work, I think.
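One hypothetical reading of the idea (the "private." prefix below is
illustrative only): names in the private namespace never resolve
against the global directory, only against a locally configured
repository:

def resolve_repository(name, local_url, global_url):
    """Route private names to the local repo, everything else globally."""
    if name.startswith("private."):
        return local_url
    return global_url

# resolve_repository("private.some-test", LOCAL, GLOBAL) -> LOCAL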
2) It should probably be schema-free, serving simple rfc822 files with
python-like classifiers (Test::Platform::Android, anyone?) as this will
allow free experimentation
FWIW, I think they're pedantically called "trove classifiers" :-)
Right, thanks!
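To make the rfc822 idea concrete, a tiny sketch (the field names are
only an example; Python's stdlib already parses such headers):

from email.parser import Parser

RAW = """\
Name: stream
Version: 1.0b3
Format: LAVA testdef version 1.3
Classifier: Test::Platform::Android
Classifier: Environment::Hardware::Board::Panda Board ES
"""

def load_testdef(text):
    """Parse an rfc822-style test definition into a header multi-map."""
    msg = Parser().parsestr(text)
    return dict((key, msg.get_all(key)) for key in set(msg.keys()))

print(load_testdef(RAW)["Classifier"])  # repeated headers come back as a list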
I guess there would be two mandatory fields: name and version. And
maybe format? So you could have
Yeah, name and version are a good start. Obviously each test definition
will have a maintainer / owner but that's not something that has to be
visible here (and it certainly won't be a part of what gets published
"to the archive" if we go that far).
Name: stream
Version: 1.0b3
Format: LAVA testdef version 1.3
We could also prefix all non-standard (non-standardized) headers with
the vendor name (Linaro-, Canonical-) or have a standard custom
extension header prefix as in HTTP's X-foo
...
and everything else would only need to make sense to LAVA.
Then you would say client side:
$ testdef-get lava-stream
We definitely need a catchy name
But seriously. I'm not entirely sure that the command line tool will be
a part of the "standard issue". The same way you use pip to install
python stuff from pypi, you'd use lava to install test definitions into
lava. I can't see how a generic tool could know how to interact with
lava and checkbox in a way that would still be useful. And while your
example is not strictly about running tests (it's about defining them),
I think it's important to emphasize -- the protocols, and maybe the
common repo, matter more than the tools, as those may be more
domain-specific for a while.
Fetched lava-stream version 1.0b3
$ vi lava-stream.txt
# update stuff
$ testdef-push lava-stream.txt
ERROR: lava-stream version 1.0b3 already exists on server
$ vi lava-stream.txt
# Oops, update version
$ testdef-push lava-stream.txt
Uploaded lava-stream version 1.0b4
I wonder if we could actually cheat and use pypi to prototype this. I
don't suppose they have a staging instance where I can register 20 tiny
projects with oddball meta-data?
3) It should (must?) have pypi-like version support so that a test can
be updated but the old definition is never lost.
Must, imho. I guess support for explicitly removing a version would be
good, but the default should be append-only.
No disagreement here
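The append-only rule is easy to pin down; a minimal sketch, assuming an
on-disk layout of testdefs/<name>/<version> as in the transcript above:

import os

def publish(root, name, version, text):
    """Store one immutable test definition revision; never overwrite."""
    target = os.path.join(root, name, version)
    if os.path.exists(target):
        raise RuntimeError("%s version %s already exists on server"
                           % (name, version))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as f:
        f.write(text)

# publish("testdefs", "lava-stream", "1.0b4", raw_definition)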
4) It probably does not have to be the download server as anyone can
host tests themselves. Just meta-data would be kept there.
By metadata you mean the key-value data as listed above, right?
Yes
(For small tests that may be enough but I can envision tests with
external code and resources)
Yeah, the way lava-test tests can specify URLs and bzr and git repos to
be fetched needs to stay I think.
That's the part I hate the most about the current LAVA setup. I think
that going forward they should go away and be converted into test
definitions that describe the very same code you'd git clone or bzr
branch. The reason I believe that is that it will allow you to do
reliable releases. It would be the same as pypi having no tarballs,
just git urls -- I think that would defeat the long-term purpose of
the directory. Remember that both the test "wrapper" / definition and
the test code are consumed by users/testers, so _both_ should be
released in the same, reliable, way.
In addition to that, having "downloads" makes offline use easier. I'm
not entirely sure how that would work with very high-level tests that,
say, apt-get install something from the archive and then run some
arbitrary commands. One might be tempted to create a reproducible test
environment where all the downloads are kept offline and versioned, but
perhaps that kind of test just needs to be explicitly marked as
non-idempotent, and that is the actual value it provides.
Thanks
ZK
--
Zygmunt Krynicki
s/Linaro Validation Team/Canonical Certification Team/
s/Validation/Android/