On Thu, Nov 16, 2017 at 10:13 AM Łukasz Gajowy <[email protected]>
wrote:

> Hi all!
>
> We are currently working on the IO IT "test harness" that will allow to
> run the IOITs on various runners, filesystems and with changing amount of
> data. It is described in a doc some of you have probably seen and put
> comments in the doc [1] (in the context of BEAM-3060 [2] task).
>
> Part of the job that in our opinion could be done is to somehow validate
> the input parameters that are passed to the test. By saying "test input
> parameters" I specifically mean:
>  - parameters passed as mvn's system properties, such as:
> "-Dfilesystem=gcs" or "-DintegrationTestRunner=dataflow"
>  - parameters that are to be passed as PipelineOptions, such as
> "numberOfRecords" or "filenamePrefix" in TextIO
>
> We imagine a situation when test parameters are passed in an incompatible
> way, eg. someone can be willing to use the "dataflow" runner on some
> filesystem that is unsupported there (say, s3). Running an IOIT with such
> setup will most certainly fail. The crux of the idea is to inform the
> developer early enough so that no such errors are made and test execution
> time is saved. It eases debugging and avoids potential configuration errors.
>
> The doc [1] fully specifies what has to be validated. We'd like to
> validate:
> 1. runner + filesystem combination  - both passed as system properties
> 2. filesystem + pipeline options - as some additional options may be
> required by specific FS.
> 3. runner + pipeline options  - as some additional options may be required
> by specific runner
> 4. IO test class instance (eg. TextIO) + options dedicated to it (eg.
> numberOfRecords)
>
> We have an idea to write a small maven plugin for the validation. An
> initial PoC to just view the concept is on GitHub [3]. The plugin would
> essentialy be a small module in the ".test-infra" folder - the same as
> other test-related stuff such as kubernetes scripts or jenkins files. We
> should hide only the validation logic there - what gets validated should be
> declaratively provided in pom.xml of each IOIT. All necessary validation
> configuration could be passed in plugin's <configuration> section [4], [5].
>
> Pros:
>  + input parameters validation is run very early - before compilation. We
> don't waste time and wait for the tests to fail
>  + validation can be run depending on profile - we can append the plugin
> to the profile's build section (eg. to an io-it profiles build section).
>  + validation is run once - after the first failure there is an exception
> thrown which breaks the build (in case of running more than one test)
>  + (AFAIK) we could support multiple sdks in the future with this plugin -
> the only condition is being able to start the mvn plugin in some initial
> build phase
>  + (AFAIK) we can run it on gradle or adapt it to be a gradle task too
>  + every IO can have a set of validation rules defined in it's pom.xml
> which is readable and clearly shows what combination can be done or not)
>
> Cons:
> - everytime a new filesystem/runner etc is supported we have to update the
> rules
> - the plugin needs to be a separate module
>
> Concerns:
>  - is validating all the four points there a good idea? Perhaps the scope
> of the things to validate in the plugin is too wide? As far as I know,
> validation for runner specific PipelineOptions is already available.
>
> What do you think?
>

+1 for idea of moving some validation logic to the build tool
(Maven/Gradle) so that tests fail early when when options are incorrect.

We should not duplicate validation logic that is available elsewhere though
(for example, runner + pipeline options) since this could lead to
inconsistencies.

I'm assuming that the solution will allow validation logic for a given test
to reside in the Maven/Gradle sub-module of that test.

Thanks,
Cham


>
> Thanks,
> Łukasz
>
> [1]
> https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit#heading=h.84fcpdbcdqu
>
> [2] https://issues.apache.org/jira/browse/BEAM-3060
> [3]
> https://github.com/lgajowy/beam/blob/validator-poc/.test-infra/testprops/src/main/java/org/apache/beam/testprops/TestPropsPlugin.java
> [4]
> https://github.com/lgajowy/beam/blob/validator-poc/sdks/java/io/jdbc/pom.xml#L46
> [5]
> https://github.com/lgajowy/beam/blob/validator-poc/sdks/java/io/file-based-io-tests/pom.xml#L47
>
>

Reply via email to