I also wrote up this dev doc that goes into more depth on how this will all
work, as well as what it will be like to create a new IO IT.

https://docs.google.com/document/d/1fISxgeq4Cbr-YRJQDgpnHxfTiQiHv8zQgb47dSvvJ78/edit?usp=sharing


S

On Wed, Jul 5, 2017 at 3:11 PM Stephen Sisk <[email protected]> wrote:

> hey all,
>
> I wanted to share an early draft of what it'll be like to invoke mvn for
> the IO integration tests in the future when we have the integration with
> kubernetes going.
>
> I'm really excited about these changes - working on the IO ITs, I have to
> run them frequently, and the command lines to run them can be quite a bear.
> For example:
>
> mvn -e verify -Dit.test=org.apache.beam.sdk.io.jdbc.JdbcIOIT
> -DskipITs=false -pl sdks/java/io/jdbc -Pio-it -Pdataflow-runner
> -DintegrationTestPipelineOptions=["--project=[project]","--gcpTempLocation=gs://[bucket]/staging","--postgresUsername=postgres","--postgresPassword=uuinkks","--postgresDatabaseName=postgres","--postgresSsl=False","--postgresServerName=[1.2.3.4]","--runner=TestDataflowRunner","--defaultWorkerLogLevel=INFO"]
>
> Also, in order to run this, I first need to have created an instance of
> this data store in kubernetes and then copied its parameters into the
> command. Inevitably I mis-copy something, or something changes, so it
> doesn't work correctly and I have to go back in and edit it.
>
> So that's a pain.
>
> To invoke the IO ITs in the future, it'll be a command like this:
>   mvn verify -Pio-it-suite -pl sdks/java/io/jdbc \
>       -DpkbLocation="path-to-pkb.py" \
>       -DintegrationTestPipelineOptions='["--tempRoot=my-temp-root"]'
> (or at least, that's what I'm proposing :)
>
> This will run the jdbc integration tests, spinning up the data store for
> that integration test in your kubernetes cluster.
>
> This is all enabled by a combination of adding new profiles in maven for
> each IO and changes to the beam benchmarks in pkb (perfkitbenchmarker) to
> control kubernetes. Jason has already done a lot of work to get pkb working
> to run our regular benchmarks, and I'm excited to continue that work for IO
> ITs. We use pkb to control kubernetes and capture our benchmark times. This
> means you'll need to install pkb if you'd like to use this nicer
> experience. However, devs will never have to use pkb if they don't want to,
> nor is changing pkb required when you want to add a new IO IT. You
> can always spin up the data store yourself and invoke the integration test
> directly.
>
> Drafts of these changes can be seen at [0] and [1] - however, I don't
> expect most folks will care about these changes other than "how do I invoke
> this?", so let me know if you have comments about how this is invoked.
>
> S
>
> [0] pom changes hooking up the call to pkb -
> https://github.com/ssisk/beam/commit/eec7cb5b71330761e71850e8e6f65f34249641b0
> [1] pkb changes enabling kubernetes spin-up -
> https://github.com/ssisk/PerfKitBenchmarker/commits/kubernetes_create
> (last 2 changes)
>
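Since most of the pain in the old command comes from hand-copying values into the -DintegrationTestPipelineOptions JSON, here is a minimal sketch of assembling the proposed command from a dict instead. The helper name build_io_it_command is hypothetical and not part of beam or pkb. The flags are only the ones from the proposed invocation quoted above, which is itself still a draft and may change.

```python
import json
import shlex


def build_io_it_command(module, pkb_location, pipeline_options):
    """Return the mvn argument list for the proposed io-it-suite invocation.

    Hypothetical helper: it only mirrors the flags shown in the draft command.
    """
    # Pipeline options are passed as a JSON array of "--key=value" strings.
    options_json = json.dumps(
        ["--{}={}".format(key, value) for key, value in pipeline_options.items()]
    )
    return [
        "mvn", "verify",
        "-Pio-it-suite",
        "-pl", module,
        "-DpkbLocation=" + pkb_location,
        "-DintegrationTestPipelineOptions=" + options_json,
    ]


if __name__ == "__main__":
    command = build_io_it_command(
        module="sdks/java/io/jdbc",
        pkb_location="path-to-pkb.py",
        pipeline_options={"tempRoot": "my-temp-root"},
    )
    # shlex.quote makes the printed line safe to paste into a POSIX shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Generating the JSON with json.dumps rather than typing it by hand keeps the brackets and quotes balanced, which is where the hand-edited version tends to go wrong.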
