On 08/08/2017 07:01 AM, Lukáš Doktor wrote:
> Hello guys,
> 
> I'm sorry for such a late response, I totally forgot about this email (thanks
> to Ademar, your response reminded me of it).
> 
> On 7.8.2017 at 23:49, Ademar Reis wrote:
>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
>>> Even though Avocado has had a parameter passing system for
>>> instrumented tests almost from day one, it has been intertwined with
>>> the varianter (then multiplexer) and this is fundamentally wrong.  The
>>> most obvious example of this broken design is the `mux-inject` command
>>> line option::
>>>
>>>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
>>>                         Inject [path:]key:node values into the final
>>>                         multiplex tree.
>>>
>>> This is broken design not because such varianter implementations can
>>> be tweaked via the command line; that's fine.  It's broken because it
>>> is the recommended way of passing parameters on the command line.
>>>
>>> The varianter (or any other subsystem) should be able to act as a
>>> parameter provider, but cannot dictate that parameters must first be
>>> nodes/keys/values of its own internal structure.
>>
>> Correct. It's broken because it violates several layers. There would
>> be nothing wrong with something like "--param [prefix:]<key:value>",
>> for example (more below).
>>
> Well, I wouldn't call it broken. The implementation is fine; we only lack other
> providers that would allow injecting just params, so people are abusing
> `mux-inject` for that.
> 
>>>
>>> The proposed design
>>> ===================
>>>
>>> A diagram has been used on a few different occasions, to describe how
>>> the parameters and variants generation mechanism should be connected
>>> to a test and to the overall Avocado architecture.  Here it is, in its
>>> original form::
>>>
>>>    +------------------------------+
>>>    | Test                         |
>>>    +------------------------------+
>>>              |
>>>              |
>>>    +---------v---------+    +--------------------------------+
>>>    | Parameters System |--->| Variants Generation Plugin API |
>>>    +-------------------+    +--------------------------------+
>>>              ^                                ^
>>>              |                                |
>>>    +--------------------------------------+   |
>>>    | +--------------+ +-----------------+ |   |
>>>    | | avocado-virt | | other providers | |   |
>>>    | +--------------+ +-----------------+ |   |
>>>    +--------------------------------------+   |
>>>                                               |
>>>                  +----------------------------+-----+
>>>                  |                                  |
>>>                  |                                  |
>>>                  |                                  |
>>>        +--------------------+           +-------------------------+
>>>        | Multiplexer Plugin |           | Other variant plugin(s) |
>>>        +--------------------+           +-------------------------+
>>>              |
>>>              |
>>>        +-----v---------------------------+
>>>        | +------------+ +--------------+ |
>>>        | | --mux-yaml | | --mux-inject | |
>>>        | +------------+ +--------------+ |
>>>        +---------------------------------+
>>>
>>>
>>> Given that the "Parameter System" is the entry point into the parameter
>>> providers, it should provide two different interfaces:
>>>
>>>  1) An interface for its users, that is, developers writing
>>>     `avocado.Test` based tests
>>>
>>>  2) An interface for developers of additional providers, such as the
>>>     "avocado-virt" and "other providers" box on the diagram.
>>>
>>> The current state of the first interface is the ``self.params``
>>> attribute.  Hopefully, it will be possible to keep its current interface,
>>> so that tests won't need any kind of compatibility adjustments.
>>
>> Right. The way I envision the parameters system includes a
>> resolution mechanism, the "path" currently used in params.get().
>> This adds extra specificity to the user who requests a parameter.
>>
>> But these parameters can be provided by any entity. In the diagram
>> above, they're part of the "Parameter System" box. Examples of
>> "other providers" could be support for a configuration file or a
>> "--param=[prefix:]<key:value>" command line option.
>>
>> The Test API to request parameters should be shared. To a test it
>> doesn't matter where the parameter is coming from. They're always
>> accessed through an API like:
>>
>>     params.get(<key>, [prefix], [fallback value])
>>
>> Where:
>>   * key (mandatory) is the configuration variable the user wants.
>>
>>   * prefix (optional) is used to find the right key and can be
>>     partially resolved, from right to left. As an example, a
>>     `params.get("bar", "name")` would be fulfilled by the parameter
>>     system if there are entries like "bar:name", "foobar:name" or
>>     "/foo/bar:name". We're using '/' in the multiplexer to
>>     separate branch levels, thus instead of "prefix", we're calling
>>     it "path", but the mechanism is the same and IMO should be
>>     generic to avoid confusion (a path at the parameter level is
>>     different than a path at the multiplexer level).
>>
>>     params.get() can be even more flexible, supporting regexps and
>>     globs...  The objective of [prefix] is to serve as a filter and
>>     to guarantee the caller is getting the variable he wants. It's
>>     similar to a namespace.
>>
>>   * fallback (optional) is the value returned if the parameter
>>     system can't resolve prefix+key.
>>
>> Users who don't want any specificity and/or have a small test base
>> with a low chance of clashes could simply ignore the prefix both
>> when creating parameters and when making calls to params.get().
>>
>>>
>>> The second item will probably mean the definition of a new class in
>>> the ``avocado.core.plugin_interfaces`` module, together with a new
>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
>>>
>>> Parameters availability: local vs. external
>>> ============================================
>>>
>>> Right now, all parameters are given to the test at instantiation time.
>>> Let's say that in this scenario, all parameters are *local* to the
>>> test.  Local parameters have the benefit that the test is self
>>> contained and doesn't need any external communication.
>>>
>>> In theory, this is also a limitation, because all parameters must be
>>> available before the test is started.  Even if other parameter system
>>> implementations are possible, with a local approach, there would be a
>>> number of limitations.  For long-running tests that may depend on
>>> parameters generated during the test, this would be a blocker.  Also,
>>> if a huge number of parameters were available (think of a huge
>>> cloud or database of parameters), they would all have to be copied to
>>> the test at instantiation time.  Finally, it also means that the
>>> source of parameters would need to be able to iterate over all the
>>> available parameters, so that they can be copied, which can be a
>>> further limitation for cascading implementations.
>>>
>>> An external approach to parameters would be one in which the test holds a
>>> handle to a broker of parameter providers.  The parameter resolution
>>> would be done at run time.  This avoids the copying of parameters, but
>>> requires runtime communication with the parameter providers.  This can
>>> make the test execution much more fragile and dependent on the external
>>> communication.  Even by minimizing the number of communication
>>> endpoints by communicating with the test runner only, it can still add
>>> significant overhead, latency and points of failure to the test
>>> execution.
>>>
>>> I believe that, at this time, the limitations imposed by local
>>> parameter availability do *not* outweigh the problems that an external
>>> approach can bring.  In the future, if advanced use cases require an
>>> external approach to parameters availability, this can be reevaluated.
>>
>> If I understand your point correctly, this is an implementation
>> detail. It depends on what the "contract" is between the test runner
>> (parameter provider) and the test being run (the user of
>> params.get()).
> The only difference I see for the scope of this RFC is that the "lazy" params
> can't be merged with "static" params before test execution (otherwise it'd
> effectively make them static as well). This is quite an important difference to
> consider further in the design...
> 

Exactly.  What we're leaning towards here is the simplest approach: all
parameters are "static" (as you call them) or "local" (as I called them).
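
Roughly what I have in mind, as a throwaway sketch (``LocalParams`` and
everything else in it are made-up names, not actual Avocado code)::

    import fnmatch

    class LocalParams:
        """Read-only view over parameters fully resolved before the test runs."""

        def __init__(self, resolved):
            # plain data: {(path, key): value}, copied once at test
            # instantiation time and never updated afterwards
            self._resolved = dict(resolved)

        def get(self, key, path='*', default=None):
            # resolution happens against the local copy only, with no
            # runtime communication with any parameter provider
            for (param_path, param_key), value in self._resolved.items():
                if param_key == key and fnmatch.fnmatch(param_path, path):
                    return value
            return default

    params = LocalParams({('/run/variant1', 'sleep_length'): '0.5'})
    assert params.get('sleep_length', path='/run/*', default=1) == '0.5'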

>>
>> For example, should users assume parameters are dynamic and can
>> change during the lifetime of a test, and therefore two identical
>> calls to params.get() might return different values?  Should it be
>> possible to change params (something like params.put()) at runtime?
> Definitely no params changing.
> 

Agreed.

>>
>> (IMO the answer should be no to both questions).
>>
>> If you have something different in mind, then it would be
>> interesting to see some real use-cases.
>>
>>>
>>> Namespaces (AKA how/if we should merge parameters)
>>> ==================================================
>>>
>>> Currently, the parameter fetching interface already contains at its
>>> core the concept of paths[1].  In theory, this is sufficient to prevent
>>> clashes between keys that have the same name but are intended to configure
>>> different aspects of a test.
>>>
>>> Now, with the proposed implementation of multiple providers to the
>>> parameter system, the question of how they will be combined comes up.
>>>
>>> One way is for each implementation to claim, based on some unique
>>> attribute such as its own name, a part of a tree path.  For instance,
>>> for two implementations:
>>>
>>>  1) variants
>>>  2) plain_ini
>>>
>>> Users could access parameters explicitly defined on those by referring
>>> to paths such as:
>>>
>>>    self.params.get('speed', path='/plain_ini', default=60)
>>>
>>> or
>>>
>>>    self.params.get('speed', path='/variants/path/inside/varianter',
>>>                    default=60)
>>>
>>> This clearly solves the clashes, but binds the implementation to the
>>> tests, which should be avoided at all costs.
>>
>> So you're providing this as an example of why it's a bad idea...
>> OK. :)
>>
> Yep, I don't see this as the way to go. One should be able to execute the same
> test with different providers (e.g. a JSON file with params generated by Jenkins).
> 

Agreed.

>>>
>>> One other approach would be to merge everything into the tree root
>>> node.  By doing this, one parameter provider could effectively
>>> override the values in another parameter provider, given that both
>>> used the same paths for a number of parameters.
> This wouldn't be possible with "lazy" providers as it'd require iterating 
> through all params.
> 

Exactly.

>>>
>>> Yet another approach would be to *not* use paths, and resort to
>>> completely separate namespaces.  A parameter namespace would be an
>>> additional level of isolation, which can quickly become exceedingly
>>> complex.
> If I understood it correctly, this is what I'd go with. The way I envision 
> this is:
> 
> 1. varianter yields a variant, which also contains params in the form of a list
> of namespaces associated with params
>     `variant = [{"/run/variant": {"foo": "bar"}},]`
> 2. runner attaches params from cmdline
>     `cmdline = [{"/1": {"foo": "bar"}}, {"/2": {"foo": "bar"}}]`
> 3. runner attaches params from the avocado-virt plugin
>     `virt = [{"/virt": {"image_name": "foo"}}]`
> 4. runner executes the test and passes the params definition there
>     {"params": [("varianter", varianter), ("cmdline", cmdline), ("virt", virt)],
>      "order": ["varianter", "__ALL__", "cmdline"]}
> 

Right, the general idea is sound.  Any reason why the runner itself
shouldn't prepare the *resulting* parameter data and pass it to the
test?  It's a matter of either passing data or code to the test.  I'd
rather pass data.

Think, for a second, about test replay.  If we pass parameter *data* to
the test, there's nothing else to solve here.  Serialization of that
data is trivial, and could even be a parameter provider itself (the only
or preferred one used at test replay time).
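
Something along these lines, reusing your example data (just a sketch,
all names here are made up)::

    import json

    def merge_providers(providers, order):
        # providers: {name: [{path: {key: value}}, ...]}
        # order: provider names, highest priority first
        merged = []
        for name in order:
            for entry in providers.get(name, []):
                for path, params in entry.items():
                    merged.append((path, params, name))
        return merged

    providers = {
        'varianter': [{'/run/variant': {'foo': 'bar'}}],
        'cmdline':   [{'/1': {'foo': 'bar'}}, {'/2': {'foo': 'bar'}}],
        'virt':      [{'/virt': {'image_name': 'foo'}}],
    }
    resolved = merge_providers(providers, ['varianter', 'virt', 'cmdline'])

    # serializing the *data* (not the provider code) is trivial, and is
    # all a replay job would ever need
    with open('params-replay.json', 'w') as record:
        json.dump(resolved, record)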

> Now the NewAvocadoParams object should create an AvocadoParams per provider
> (varianter, cmdline, virt), being able to query params from any of them. The
> process of getting params would get a bit more complicated for Avocado, but
> for the user nothing changes. Let's say the user issues params.get("foo", "/*"):
> 
> 1. NewAvocadoParams looks at the "order"
> 2. varianter is asked for a value and reports "bar"; as this is a valid
> response, no further provider is queried and "bar" is returned
> 

I still find it much more deterministic to do that before the test kicks
in.  The test would have the resulting data for all the providers instead.

> Now for params.get("foo", "/1/*")
> 
> 1. NewAvocadoParams looks at the "order"
> 2. varianter is asked and reports no match
> 3. __ALL__ => ask providers without an assigned priority; this applies to
> "virt" in this example, which reports no match
> 4. cmdline is asked and returns "bar"
> 

Oh, special things are bad, and so is the "__ALL__" you're suggesting
here.  Also, just to be clear, do you mean special behavior for the
params generated by the varianter?

> And params.get("missing")
> 
> 1. NewAvocadoParams looks at the "order"
> 2. varianter => no match
> 3. virt => no match
> 4. cmdline => no match
> 5. default is reported (None)
> 

Again, I fail to see why this shouldn't be done by the test runner instead.

> 
> The "order" is important, by default I'd suggest 
> varianter=>plugins=>params=>default_params (basically what Ademar suggested 
> and we all agreed many times), but it should be possible to set it if 
> necessary (eg. when user have multiple plugins and knows the order).
> 

This seems to indicate the varianter has a special role.  Can you
elaborate on that?

> Now in case of a clash I believe it should report the clash even though further
> providers might be able to report a value. Let's change the order to ["cmdline",
> "__ALL__"] and query params.get("foo", "/*")
> 
> 1. cmdline is asked; it raises ValueError("Leaves [/1, /2] contain key
> 'foo'\n  /1:foo:bar\n  /2:foo:bar"), which is forwarded to the test.
> 
> This schema would work for "lazy" providers as they'd only be asked for a
> specific key, and only when needed.
> 

The way I see this: clashes could be reported at merge time, as a
mega-uber-verbose option, but I'd refrain from it.  After that,
parameter data is clearly handed to a test.  The user running and
debugging the test can clearly see the parameters available.  A verbose
mode can show something similar to the Avocado-VT Cartesian config based
parameters.
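
If we ever go for that verbose mode, it could be as simple as this (again
just a sketch with made-up names, operating on the already-merged data)::

    import logging

    LOG = logging.getLogger('avocado.params')

    def report_clashes(merged):
        # merged: list of (path, {key: value}, provider_name) tuples,
        # as prepared by the runner before the test is instantiated
        seen = {}
        for path, params, provider in merged:
            for key in params:
                previous = seen.setdefault((path, key), provider)
                if previous != provider:
                    LOG.warning("parameter %s:%s set by both '%s' and '%s'",
                                path, key, previous, provider)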

- Cleber.

>>
>> I think using paths is confusing because it mixes concepts which are
>> exclusive to the multiplexer (a particular implementation of the
>> varianter) with an API that is shared by all other parameter
>> providers.
>>
>> For example, when you say "merge everything into the tree root
>> node", are you talking about namespace paths, or paths as used by
>> the multiplexer when the "!mux" keyword is present?
> There is no difference between a namespace path and a path in the multiplexer. The
> only difference is that the multiplexer can provide a different slice, but the
> namespace is always the same, only not existing in some slices.
> 
>>>
>>> As can be seen from the section name, I'm not proposing one solution
>>> at this point, but hoping that a discussion on the topic would help
>>> achieve the best possible design.
>>
>> I think this should be abstracted away from the Test (in other words, not
>> exposed through any API). The order, priority and merging of
>> parameters is a problem to be solved at run-time by the test runner.
>>
>> All a test needs to "know" is that there's a parameter with the name
>> it wants.
>>
>> In the case of clashes, specifying a prioritization should be easy.
>> We could use a similar approach to how we prioritize Avocado's own
>> configuration.
>>
>> Example: from less important to top priorities, when resolving a
>> call to params.get():
>>
>>    * "default" value provided to params.get() inside the test code.
>>    * Params from /etc/avocado/global-variants.ini
>>    * Params from ~/avocado/global-variants.ini
>>    * Params from "--params=<file>"
>>    * Params from "--param=[prefix:]<key:value>"
>>    * Params from the variant generated from --mux-yaml=<file> (and
>>      using --mux-inject would have the same effect of changing
>>      <file> before using it)
>>
>> The key of this proposal is simplicity and scalability: it doesn't
>> matter if the user is running the test with the varianter, a simple
>> config file (--params=<file>) or passing some parameters by hand
>> (--param=key:value). The Test API and behavior are the same and the
>> users get a consistent experience.
> Yep. Basically there is a test writer and a test user. The test writer should write
> a test and ask for params. The test user should provide params, which might also
> include a specific order, if needed (via cmdline or config).
> 
> Lukáš
> 
>>
>> Thanks.
>>    - Ademar
>>
>>> [1] -
>>> http://avocado-framework.readthedocs.io/en/52.0/api/core/avocado.core.html#avocado.core.varianter.AvocadoParams.get
>>
>>
>>
> 
> 

-- 
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
