On Thu, Nov 16, 2017 at 10:13 AM Łukasz Gajowy <[email protected]> wrote:
> Hi all! > > We are currently working on the IO IT "test harness" that will allow to > run the IOITs on various runners, filesystems and with changing amount of > data. It is described in a doc some of you have probably seen and put > comments in the doc [1] (in the context of BEAM-3060 [2] task). > > Part of the job that in our opinion could be done is to somehow validate > the input parameters that are passed to the test. By saying "test input > parameters" I specifically mean: > - parameters passed as mvn's system properties, such as: > "-Dfilesystem=gcs" or "-DintegrationTestRunner=dataflow" > - parameters that are to be passed as PipelineOptions, such as > "numberOfRecords" or "filenamePrefix" in TextIO > > We imagine a situation when test parameters are passed in an incompatible > way, eg. someone can be willing to use the "dataflow" runner on some > filesystem that is unsupported there (say, s3). Running an IOIT with such > setup will most certainly fail. The crux of the idea is to inform the > developer early enough so that no such errors are made and test execution > time is saved. It eases debugging and avoids potential configuration errors. > > The doc [1] fully specifies what has to be validated. We'd like to > validate: > 1. runner + filesystem combination - both passed as system properties > 2. filesystem + pipeline options - as some additional options may be > required by specific FS. > 3. runner + pipeline options - as some additional options may be required > by specific runner > 4. IO test class instance (eg. TextIO) + options dedicated to it (eg. > numberOfRecords) > > We have an idea to write a small maven plugin for the validation. An > initial PoC to just view the concept is on GitHub [3]. The plugin would > essentialy be a small module in the ".test-infra" folder - the same as > other test-related stuff such as kubernetes scripts or jenkins files. We > should hide only the validation logic there - what gets validated should be > declaratively provided in pom.xml of each IOIT. All necessary validation > configuration could be passed in plugin's <configuration> section [4], [5]. > > Pros: > + input parameters validation is run very early - before compilation. We > don't waste time and wait for the tests to fail > + validation can be run depending on profile - we can append the plugin > to the profile's build section (eg. to an io-it profiles build section). > + validation is run once - after the first failure there is an exception > thrown which breaks the build (in case of running more than one test) > + (AFAIK) we could support multiple sdks in the future with this plugin - > the only condition is being able to start the mvn plugin in some initial > build phase > + (AFAIK) we can run it on gradle or adapt it to be a gradle task too > + every IO can have a set of validation rules defined in it's pom.xml > which is readable and clearly shows what combination can be done or not) > > Cons: > - everytime a new filesystem/runner etc is supported we have to update the > rules > - the plugin needs to be a separate module > > Concerns: > - is validating all the four points there a good idea? Perhaps the scope > of the things to validate in the plugin is too wide? As far as I know, > validation for runner specific PipelineOptions is already available. > > What do you think? > +1 for idea of moving some validation logic to the build tool (Maven/Gradle) so that tests fail early when when options are incorrect. We should not duplicate validation logic that is available elsewhere though (for example, runner + pipeline options) since this could lead to inconsistencies. I'm assuming that the solution will allow validation logic for a given test to reside in the Maven/Gradle sub-module of that test. Thanks, Cham > > Thanks, > Łukasz > > [1] > https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit#heading=h.84fcpdbcdqu > > [2] https://issues.apache.org/jira/browse/BEAM-3060 > [3] > https://github.com/lgajowy/beam/blob/validator-poc/.test-infra/testprops/src/main/java/org/apache/beam/testprops/TestPropsPlugin.java > [4] > https://github.com/lgajowy/beam/blob/validator-poc/sdks/java/io/jdbc/pom.xml#L46 > [5] > https://github.com/lgajowy/beam/blob/validator-poc/sdks/java/io/file-based-io-tests/pom.xml#L47 > >
