Re: [Avocado-devel] RFC: Configuration by convention

Cleber Rosa Tue, 03 Dec 2019 11:15:39 -0800

On Thu, Nov 28, 2019 at 04:00:17PM +0100, Lukáš Doktor wrote:
> On 21. 11. 19 at 22:23, Beraldo Leal wrote:
> > Hi all,
> > 
> 
> Hello Beraldo,
> 
> I do like (ideally written) conventions as far as they don't block us.
> 
> > I am working on a card about "Configuration by convention", and I realized
> > that it would be better to consult the list first regarding a few key
> > points.
> > 
> > So I would like to share this RFC with you and get your feedback.
> > 
> > TL;DR
> > #####
> > 
> > The number of plugins written by many different people, and the lack of
> > conventions for names, config options, and argument types, may hurt
> > Avocado's usability. This also makes it challenging to create a future API
> > for executing more complex jobs. I would like to discuss in this RFC some
> > proposals to improve this.
> > 
> > Note that, since this is a relatively big change, this RFC, if agreed upon,
> > could be broken down into smaller issues to ease its acceptance into the
> > master branch.
> > 
> > Motivation
> > ##########
> > 
> > An Avocado Job is primarily executed through the `avocado run` command line.
> > The behavior of such an Avocado Job is determined by parsing the following
> > settings (listed in parsed order):
> > 
> >  1) Default values in source code
> >  2) Configuration file contents
> >  3) Command-line options
> > 
> 
> What I'm missing in this RFC is some kind of mapping between the above. I
> think if we are to make such intrusive changes, we should spend some time
> on specifying the relations. (but maybe I just missed it)
>


I'm assuming that by mapping you mean the exact convention that would
be implemented from a command-line option to a configuration file
content to default value name.  Then, yes, we need a clearer definition,
and I guess Beraldo intends to do that in the "blueprint" document.

The consequences for user experience, including deprecation and
migration plans, were briefly raised in the "Backwards
Compatibility" section.
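
For reference, the parsing order quoted above (defaults, then
configuration file, then command line) can be sketched in a few lines.
This is only an illustration, not how Avocado does it today: the dotted
"section.option" keys, the option names, and the resolve() helper are
all hypothetical.

```python
import configparser
from collections import ChainMap

# Hypothetical defaults, keyed by "section.option" (the naming is an assumption).
DEFAULTS = {"sysinfo.collect": "on", "runner.timeout": "60"}


def resolve(config_text, cli_options):
    """Merge the three sources; later sources override earlier ones."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    from_file = {
        f"{section}.{option}": value
        for section in parser.sections()
        for option, value in parser.items(section)
    }
    # ChainMap looks keys up left to right, so the highest priority goes first.
    return dict(ChainMap(cli_options, from_file, DEFAULTS))


settings = resolve("[runner]\ntimeout = 120\n", {"sysinfo.collect": "off"})
print(settings)
```

With a single key namespace shared by all three sources, the "mapping"
becomes a plain lookup rather than per-option glue code.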

> > Currently, the Avocado config file is an .ini file that is parsed by
> > Python's `configparser` library, and this config is broken into sections.
> > Each Avocado plugin has its dedicated section.
> > 
> > Today, the parsing of the command-line options is done by the `argparse`
> > library and produces a dictionary that is given to the
> > `avocado.core.job.Job()` class as its `config` parameter.
> > 
> > There is no convention on the naming pattern used either in configuration
> > files or on command-line options. Besides the naming convention, there is
> > also a lack of convention for some argument types. For instance::
> > 
> >  $ avocado run -d
> > 
> > and::
> > 
> >  $ avocado run --sysinfo on
> > 
> > Both are boolean variables, but with different "execution model" (the former
> > doesn't need arguments and the latter needs `on` or `off` as argument).
> > 
> 
> Actually, we do follow the pattern for booleans. The "--sysinfo" is a
> tri-state.
>
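
To make the distinction concrete, here is a minimal `argparse` sketch. It
assumes that "tri-state" means on, off, or "not given, so defer to the
configuration file"; the parser below is a stand-in, not Avocado's real
one, and the long name `--dry-run` is spelled out only for readability.

```python
import argparse

parser = argparse.ArgumentParser(prog="avocado-sketch")
# Plain boolean flag: present means True, absent means False.
parser.add_argument("-d", "--dry-run", action="store_true")
# Tri-state option: "on", "off", or None when not given at all, so that a
# later stage can fall back to the configuration file or to the default.
parser.add_argument("--sysinfo", choices=("on", "off"), default=None)

for argv in ([], ["-d"], ["--sysinfo", "off"]):
    args = parser.parse_args(argv)
    print(argv, "->", args.dry_run, args.sysinfo)
```

A pure flag cannot express "explicitly off", which is the case that
matters when the config file says "on".
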
> > Since the Avocado trend is to have more and more plugins, we need to design
> > a naming convention for command-line arguments and settings to avoid chaos.
> > 
> > But, most importantly: it would be valuable for our users if Avocado
> > provided a Python API so that developers could write more complex jobs
> > programmatically, and advanced users who know the configuration entries
> > used by jobs could do a quick one-off execution on the command line.
> > 
> > Example::
> > 
> >  import sys
> >  from avocado.core.job import Job
> > 
> >  config = {'references': ['tests/passtest.py:PassTest.test']}
> > 
> >  with Job(config) as j:
> >    sys.exit(j.run())
> > 
> > Before we address this API use case, it is important to create this
> > convention so we can have an intuitive use of Avocado config options.
> > 
> > .. note:: We understand that plugin developers have the flexibility to
> >           configure their options as desired, but inside Avocado core and
> >           plugins, settings should have a good naming convention.
> > 
> > 
> > Specification
> > #############
> > 
> > 
> > Standards for Command Line Interface
> > ------------------------------------
> > 
> > When it comes to the command-line interface, a very useful reference is
> > the POSIX Standard's recommendation for arguments[1]. Avocado should try
> > to follow this standard and its recommendations.
> > 
> > This standard does not cover long options (starting with --). For these, we
> > should also embrace the GNU extension[2].
> > 
> > One of the goals of this extension, by introducing long options, was to make
> > command-line utilities user-friendly. Another aim was to create a norm
> > among different command-line utilities. Thus, --verbose, --debug,
> > --version (among other options) would have the same behavior in many
> > programs. Avocado should, where applicable, use the GNU long options
> > table[3] as a reference.
> > 
> > Many of these recommendations are obvious and already used by Avocado or
> > enforced by default, thanks to libraries like `argparse`.
> > 
> > However, those libraries do not force the developer to follow all
> > recommendations.
> > 
> > Besides the basic ones, here are some recommendations we should try to
> > follow and pay attention to:
> > 
> >   1. Option-arguments should not be optional (Guideline 7, from POSIX). So
> >      we should avoid this::
> >      
> >         avocado run --loaders [LOADERS [LOADERS ...]]
> > 
> 
> Well you might want to specify no loaders (to override the default), although 
> the only usecase I see is self-testing. But how about:
> 
>     avocado run --loaders LOADERS [LOADERS ...]
> 
> is that acceptable?
> 
> >      or::
> > 
> >         avocado run --store-logging-stream [STREAM[:LEVEL] [STREAM[:LEVEL] ...]]
> > 
> >      We can have::
> > 
> >         avocado run --loaders LOADER,LOADER,LOADER,...
> 
> ^^ Inventing another separator usually leads to non-systematic escaping
> 
> > 
> >      or::
> > 
> >         avocado run --loader LOADER --loader LOADER --loader LOADER
> 
> ^^ this one is really verbose
> 
> I dislike both proposals. This form:
> 
>     avocado run --loaders LOADERS [LOADERS ...]
> 
> is well supported and widely used by other programs. We can argue about 
> "nargs=*" but IMO it sometimes makes sense (when we do want to accept empty 
> sets, like filters...)
> 
> > 
> >   2. Use hyphens, not underscores: long options consist of ‘--’ followed by
> >      a name made of alphanumeric characters and dashes. Option names are
> >      typically one to three words long, with hyphens to separate words.
> >      Users can abbreviate the option names as long as the abbreviations are
> >      unique. Also, an underscore sometimes gets "eaten" by a terminal border
> >      and thus looks like a space.
> > 
> 
> sure "[a-z-]*" works for me for long options. As for short "-" options it's 
> useful to extend it to "[a-zA-Z]" eg. to enable/disable an option.
> 
> >   3. When naming subcommand options, you don’t have to worry about name
> >      conflicts outside the subcommand scope; just keep them short, simple,
> >      and intuitive.
> > 
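
On recommendation 2 above: `argparse` already gives us most of this for
free, which is relevant to any option-name to dictionary-key mapping. A
minimal sketch, using a stand-in parser rather than Avocado's real one:

```python
import argparse

parser = argparse.ArgumentParser(prog="avocado-sketch")
parser.add_argument("--store-logging-stream", nargs="+")

# argparse derives the attribute name by replacing "-" with "_".
args = parser.parse_args(["--store-logging-stream", "avocado.core:DEBUG"])
print(args.store_logging_stream)

# Unambiguous prefixes are accepted by default (allow_abbrev=True).
short = parser.parse_args(["--store-log", "avocado.core:DEBUG"])
print(short.store_logging_stream)
```

So hyphens on the command line and underscores in the resulting
namespace can coexist, as long as the convention states which spelling
is the canonical key.
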
> > Argument Types
> > ~~~~~~~~~~~~~~
> > 
> > Basic types, like strings and integers, are clear in how they are used. But
> > here is a list of what to expect when using other types:
> > 
> >   1. **Booleans**: Boolean options should be expressed as "flag" arguments
> >        (without the "option-argument"). Flags, when present, should
> >        represent a True/Active value. This will reduce the command-line
> >        size. We should avoid using this::
> > 
> >         avocado run --json-job-result {on,off}
> > 
> >   2. **Lists**: When an option-argument has multiple values, we should use
> >        the space as the separator.
> > 
> 
> This basically means:
> 
>     avocado run --loaders LOADERS [LOADERS ...]
> 
> right?
> 
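
For comparison, the list styles discussed so far map onto `argparse` as
below. This is a sketch on a stand-in parser: `--filters` and the
singular `--loader` are hypothetical options used only to show
`nargs="*"` and `action="append"`.

```python
import argparse

parser = argparse.ArgumentParser(prog="avocado-sketch")
# nargs="+": at least one value is required after the option.
parser.add_argument("--loaders", nargs="+", default=["file"])
# nargs="*": zero values are accepted, which yields an empty list.
parser.add_argument("--filters", nargs="*", default=None)
# action="append": one value per occurrence, the verbose alternative.
parser.add_argument("--loader", action="append", default=None)

args = parser.parse_args(
    ["--loaders", "file", "external", "--filters",
     "--loader", "file", "--loader", "external"]
)
print(args.loaders, args.filters, args.loader)
```

One thing to keep in mind with the space-separated form: `nargs="+"` and
`nargs="*"` consume every following non-option argument, so a positional
test reference placed right after the list would be taken as a list
item.
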
> > 
> > Presentation
> > ~~~~~~~~~~~~
> > 
> > Finding options easily, either in the manual or in the help, favors
> > usability and avoids chaos.
> > 
> > We can arrange the display of these options in alphabetical order within
> > each section.
> > 
> 
> I'd love to (more or less), but sometimes people forget. It's hard to enforce
> this. Also, there are exceptions where we want to make some options more
> visible, but in the majority of cases it should be A-Z.
>

Yes, this is ideal... the tricky question is how and at what
(development) cost.

> > 
> > Standards for Config File Interface
> > -----------------------------------
> > 
> > .. note:: Many other config file options could be used here, but since
> >           this is another discussion, I'm assuming that we are going to keep
> >           using `configparser` for a while.
> > 
> > As one of the main motivations of this RFC is to create a convention to
> > avoid chaos and make the job execution API as straightforward to use as
> > possible, I believe that the config file should be as close as possible to
> > the dictionary that will be passed to this API.
> > 
> > For this reason, this may be the most critical point of this RFC. We should
> > create a pattern that makes it intuitive for the developer to convert from
> > one format to another without much juggling.
> > 
> > Nested Sections
> > ~~~~~~~~~~~~~~~
> > 
> > While the current `configparser` library does not support nested sections,
> > Avocado can use the dot character as a convention for that, e.g.
> > `[runner.output]`.
> > 
> > This convention will be important soon, when converting a dictionary into a
> > config file and vice-versa.
> > 
> 
> This is the only mention of the args->config mapping. Can you please
> elaborate a bit more?
> 
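
Until the blueprint spells this out, one possible reading of the dot
convention is sketched below: dotted section names become nested
dictionaries. The sections_to_nested() helper and the option names are
hypothetical; only the `[runner.output]` section name comes from the
RFC.

```python
import configparser


def sections_to_nested(parser):
    """Turn "a.b" section names into nested dictionaries."""
    tree = {}
    for section in parser.sections():
        node = tree
        for part in section.split("."):
            node = node.setdefault(part, {})
        node.update(parser.items(section))
    return tree


parser = configparser.ConfigParser()
parser.read_string("[runner]\ntimeout = 60\n\n[runner.output]\ncolored = yes\n")
print(sections_to_nested(parser))
```

Note that this naive version lets an option named `output` in `[runner]`
collide with the `[runner.output]` subsection, so the convention would
need a rule for that case.
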
> > And since almost everything in Avocado is a plugin, each plugin section
> > should **not** use the "plugins" prefix and **must** respect the reserved
> > sections mentioned before. Currently, we have a mix of sections that start
> > with "plugins" and sections that don't.
> > 
> 
> So basically
> 
> [vt]
> vt-related-option
> 
> [vt.generic]
> generic-vt-related-option
> 
> [runner]
> runner-related-option
> 
> yes, the plugins section seems redundant as many parts are actually
> implemented as plugins.
>

Yes, agreed.  The "plugins" prefix can go.

> > Plugin section name
> > ~~~~~~~~~~~~~~~~~~~
> > 
> > I am not quite sure here and would like to know the opinion of those who
> > have been in the project the longest. Perhaps this is a slightly
> > controversial point, but I believe we can touch on it here to improve our
> > convention.
> > 
> > Most plugins currently have the same name as the Python module. Examples:
> > human, diff, tap, nrun, run, journal, replay, sysinfo, etc.
> > 
> > These are examples of "good" names.
> > 
> > However, some other plugins do not follow this convention. Examples:
> > runnable_run, runnable_run_recipe, task_run, task_run_recipe, archive, etc.
> > 
> > I believe that having a convention here helps when writing more complex
> > tests and config files, as well as when looking for plugins in various
> > parts of the project, either on a manual page or during the installation
> > procedure.
> > 
> > I understand that the name of the plugin is different from the module name
> > in Python, but anyway, should we follow PEP8 in this case?
> > 
> >         From PEP8: Modules should have short, all-lowercase names.
> >         Underscores can be used in the module name if it improves
> >         readability. Python packages should also have short, all-lowercase
> >         names, although the use of underscores is discouraged.
> > 
> 
> I'm not sure I properly understand this section. Can you please elaborate a
> bit more? Is the "_" -> "-" the problem you want to avoid?
> 
> > Reserved Sections
> > ~~~~~~~~~~~~~~~~~
> > 
> > We should reserve a few sections for Avocado's core functionality, e.g.
> > main, plugins, logs, job, etc.
> > 
> > Not sure here; does it make sense?
> > 
> 
> If we are to remove the "plugins." namespace then yes, we should reserve some 
> names. At least "core" to indicate core options, or all above (plus perhaps 
> some other core parts).
>

How can we tell if we have reserved *enough* sections?  If we know that we
need a section such as "logs", and use it, this is a de facto reservation.
What worries me is preventive reservation, because it will probably be
speculative.  In a programming language, reserved words have a use, and
thus variables and other statements can't use them.  But imagine a reserved
word that is never used...

- Cleber.
