On Wed, Sep 25, 2019 at 4:42 PM Cleber Rosa <cr...@redhat.com> wrote: > > On Wed, Sep 25, 2019 at 03:16:55PM +0200, Amador Pahim wrote: > > On Tue, Sep 24, 2019 at 9:42 PM Willian Rampazzo <wramp...@redhat.com> > > wrote: > > > > > > Hello everyone, > > > > > > I started working on Trello card > > > https://trello.com/c/T3SC1sZs/1521-implement-fetch-assets-command-line > > > as part of a broader card, > > > https://trello.com/c/CKP7YS6G/1481-on-cache-check-for-asset-fetcher > > > and I would like to bring my findings to a discussion. > > > > > > One way to implement that would be to parse the test source looking > > > for the fetch_asset call and execute it. In theory, it is a straight > > > forward implementation. > > > > > > I have started working in the parser. After some simple tests, some > > > complex situations started to show up. Let me bring an existing > > > example, examples/tests/assets.py: > > > > > > def setUp(self): > > > mirrors = ['https://mirrors.peers.community/mirrors/gnu/hello/', > > > 'https://mirrors.kernel.org/gnu/hello/', > > > 'http://gnu.c3sl.ufpr.br/ftp/', > > > 'ftp://ftp.funet.fi/pub/gnu/prep/hello/'] > > > hello = 'hello-2.9.tar.gz' > > > hello_locations = ["%s/%s" % (loc, hello) for loc in mirrors] > > > hello_sig = 'hello-2.9.tar.gz.sig' > > > hello_sig_locations = ["%s/%s" % (loc, hello_sig) for loc in > > > mirrors] > > > self.hello = self.fetch_asset( > > > name=hello, > > > locations=hello_locations) > > > self.hello_sig = self.fetch_asset( > > > name=hello_sig, > > > asset_hash='f3b9fae20c35740004ae7b8de1301836dab4ac30', > > > locations=hello_sig_locations) > > > > > > When the parser finds the fetch_asset call, it needs to inspect its > > > arguments, looking for variables. If any variable is found, it needs > > > to walk back the code looking for its assignment, so it is able to > > > execute it prior to the fetch_asset execution. If the variables > > > consist of other variables, the parser, again, needs to walk back in > > > the code looking for those assignments, and so on. > > > > > > There are countless ways of creating the right side of a variable > > > assignment, from a single string assignment to a function that builds > > > the content, or conditional assignments. > > > > > > My initial idea is to cover variables consisting of other variables > > > and list comprehension, just like in the example. For now, other > > > complex constructions would be out of this implementation. > > > > > > Any comment, concern, suggestion, is appreciated here as it would help > > > to build a more robust code. > > > > The problem with not covering all the cases is that, well, you're not > > covering all the cases :) > > And the parsing of code to execute snippets out of it is always a > > tricky thing to do. What if your variable is being filled by a > > function call that acquires data from an external source that requires > > a specific environment... how to reproduce that environment in the > > command liner? There are way too many cases were things can go wrong. > > > > Have you considered adding support for args-from-config-file in the > > asset fetcher? Something like: > > > > asset.Asset(args_from_config='/foo/bar/my_asset_01.yaml') > > > > That's an interesting approach, in which the upside is the ease of > parsing the data, and the downside is that requiring developers > writing tests to use an external file. Requiring the "test" to be > split in multiple files may be a useful feature to scale a very large > test, but requiring it for all cases is not a good thing. We may find > out that, out of the viable options, that this turns out to be the > best one, but we should keep the cons in mind. > > Anyway, this actually reminds me of a older idea, on the topic of > parameters, in which we'd use the test's data directories to fetch > files containing parameters that would be applied by default to tests. > But, if we decide on the data format of those files, say YAML or JSON > (wether they contain parameters or asset definitions), we'd be limited > by the static nature of it. > > I remember thinking about that problem, and realizing that a > compromise may be to also allow for executable files that would > dynamic content. So, for test `my.py:My:test_asset`, we would look > for in its data sources directories: > > * my.py/My/test_asset.data/ > * my.py/My.data/ > * my.py.data/ > > Within any of the data directories we could have plugins that would > either read the asset files and produce parameters for asset.Asset(), > or plugins that would execute files and produce the same thing. For > instance: > > * my.py/My/test_asset.data/assets.json: > > {"name": "hello-2.9.tar.gz", > "locations": > ["https://mirrors.peers.community/mirrors/gnu/hello/hello-2.9.tar.gz", > "https://mirrors.kernel.org/gnu/hello/hello-2.9.tar.gz", > "http://gnu.c3sl.ufpr.br/ftp/hello-2.9.tar.gz", > "ftp://ftp.funet.fi/pub/gnu/prep/hello/hello-2.9.tar.gz"], > "asset_hash": "cb0470b0e8f4f7768338f5c5cfe1688c90fbbc74"} > > * my.py/My/test_asset.data/assets.py: > > #!/usr/bin/env python > import json > > name = "hello-2.9.tar.gz" > mirrors = ['https://mirrors.peers.community/mirrors/gnu/hello/', > 'https://mirrors.kernel.org/gnu/hello/', > 'http://gnu.c3sl.ufpr.br/ftp/', > 'ftp://ftp.funet.fi/pub/gnu/prep/hello/'] > locations = ["%s/%s" % (loc, name) for loc in mirrors] > > asset = {"name": name, > "locations": locations, > "asset_hash": "cb0470b0e8f4f7768338f5c5cfe1688c90fbbc74"} > print(json.dumps(asset))
I really like this. I'd specify some interface though, say importing my.py/My/test_asset.data/assets.py and calling its main(), expecting the json as the returned data, but I got your point and it looks like a very good way to go. A basic plugin that gets the file.json content from the data sources would be a nice first step. > > The executable option (my.py/My/test_asset.data/assets.py), while it > somehow violates the "don't execute *test* code before running the > test" principle, it's an opt-in feature, and still wouldn't be > executed while listing tests. When checking and downloading assets, > though, it would be executed. > > > Or, instead of a separate config file, I'd also consider some reserved > > keys in the parametrization interface (we have some already: > > https://avocado-framework.readthedocs.io/en/latest/WritingTests.html?highlight=timeout#setting-a-test-timeout). > > > > This is similar to: > > > https://trello.com/c/WPd4FrIy/1479-add-support-to-specify-assets-in-test-docstring > > But your suggestion is better in the sense that users could add > dynamic content in such a variable, and at the same time would limit > or make the the parsing of such content dificult before executing the > test IIUC. > > - Cleber. > > > That way you would enable the asset fetcher caching tool to read from > > the same configuration input and cache things in advance. Anyone > > willing to use that feature would have to either use the > > args_from_config also in their tests or specify the args accordingly. > > > > > > > > Best regards, > > > > > > Willian Rampazzo > > > Software Engineer > > > Red Hat Brazil > > > 2717 337F 7E4A 5FDF > > > > > > > > > -- > > Pahim > > -- Pahim