I also wrote up this dev doc that goes into more depth on how this will all work, as well as what it will be like to create a new IO IT.
https://docs.google.com/document/d/1fISxgeq4Cbr-YRJQDgpnHxfTiQiHv8zQgb47dSvvJ78/edit?usp=sharing

S

On Wed, Jul 5, 2017 at 3:11 PM Stephen Sisk <[email protected]> wrote:

> hey all,
>
> I wanted to share an early draft of what it'll be like to invoke mvn for
> the IO integration tests in the future when we have the integration with
> kubernetes going.
>
> I'm really excited about these changes - working on the IO ITs, I have to
> run them frequently, and the command lines to run them can be quite a bear.
> For example:
>
> mvn -e verify -Dit.test=org.apache.beam.sdk.io.jdbc.JdbcIOIT
> -DskipITs=false -pl sdks/java/io/jdbc -Pio-it -Pdataflow-runner
> -DintegrationTestPipelineOptions=["--project=[project]","--gcpTempLocation=gs://[bucket]/staging","--postgresUsername=postgres","--postgresPassword=uuinkks","--postgresDatabaseName=postgres","--postgresSsl=False","--postgresServerName=[1.2.3.4]","--runner=TestDataflowRunner","--defaultWorkerLogLevel=INFO"]
>
> Also, in order to run this, I first need to have created an instance of
> this datastore in kubernetes and then copied the parameter and inevitably I
> mis-copy something in there or something changes, so it doesn't work
> correctly and I have to go back in and edit it.
>
> So that's a pain.
>
> To invoke the IO ITs in the future, it'll be a command like this:
> mvn verify -Pio-it-suite -pl sdks/java/io/jdbc
> -DpkbLocation="path-to-pkb.py" \
> -DintegrationTestPipelineOptions='["--tempRoot=my-temp-root"]'
> (or at least, that's what I'm proposing :)
>
> This will run the jdbc integration tests, spinning up the data store for
> that integration test in your kubernetes cluster.
>
> This is all enabled by a combination of adding new profiles in maven for
> each IO and changes to the beam benchmarks in pkb (perfkitbenchmarker) to
> control kubernetes. Jason has already done a lot of work to get pkb working
> to run our regular benchmarks, and I'm excited to continue that work for IO
> ITs. We use pkb to control kubernetes and capture our benchmark times. This
> means you'll need to install pkb if you'd like to use this nicer
> experience, however, devs will never have to use pkb if they don't want to,
> nor is making changes in pkb required when you want to add a new IO IT. You
> can always spin up the data store yourself, and invoke the integration test
> directly.
>
> Drafts of these changes can be seen at [0] and [1] - however, I don't
> expect most folks will care about these changes other than "how do I invoke
> this?", so let me know if you have comments about how this is invoked.
>
> S
>
> [0] pom changes hooking up the call to pkb -
> https://github.com/ssisk/beam/commit/eec7cb5b71330761e71850e8e6f65f34249641b0
> [1] pkb changes enabling kubernetes spin up-
> https://github.com/ssisk/PerfKitBenchmarker/commits/kubernetes_create
> (last 2 changes)
>
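
For anyone who wants to script the proposed invocation rather than hand-edit the JSON array, here is a minimal Python sketch. It is not part of the pom or pkb changes linked above. The helper name `build_mvn_command` and its parameters are hypothetical; only the flag names and values (`-Pio-it-suite`, `-pl`, `-DpkbLocation`, `-DintegrationTestPipelineOptions`, `--tempRoot`) come from the command in the mail, and the proposal itself may still change.

```python
import json
import shlex


def build_mvn_command(module, profiles, pipeline_options, extra_props=None):
    """Assemble a `mvn verify` argument list (hypothetical helper, not Beam code)."""
    # Each pipeline option becomes "--name=value"; the whole list is passed
    # to maven as a single compact JSON array, as in the mail's example.
    options_json = json.dumps(
        [f"--{name}={value}" for name, value in pipeline_options.items()],
        separators=(",", ":"),
    )
    cmd = ["mvn", "verify", "-pl", module]
    cmd += [f"-P{profile}" for profile in profiles]
    for name, value in (extra_props or {}).items():
        cmd.append(f"-D{name}={value}")
    cmd.append(f"-DintegrationTestPipelineOptions={options_json}")
    return cmd


cmd = build_mvn_command(
    module="sdks/java/io/jdbc",
    profiles=["io-it-suite"],
    pipeline_options={"tempRoot": "my-temp-root"},
    extra_props={"pkbLocation": "path-to-pkb.py"},
)

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Building the argument list programmatically keeps the quoting of the JSON array in one place, which is the part that tends to get mis-copied when the command is edited by hand.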
