Integration tests are a great idea. Yeah, I guess going forward, using git directly to download templates may be a viable option.
Simon

On Friday, July 22, 2016, Donald Szeto <[email protected]> wrote:

> Hey guys,
>
> This proposal of adding integration tests is awesome!
>
> Echoing Xusen, I recall Pat suggested we could remove pio template get, so
> it should be okay to just git clone the template from somewhere. I think
> Marcin is including the template now because currently templates still use
> artifacts under the old io.prediction package namespace.
>
> Regards,
> Donald
>
> On Friday, July 22, 2016, Xusen Yin <[email protected]> wrote:
>
> > Hi Marcin,
> >
> > Personally I vote for adding integration tests. Thanks for the proposal.
> > One suggestion is about the test scenarios. IMHO there is no need to add
> > the recommendation-engine template inside the PredictionIO codebase. Why
> > not use pio template to download them from GitHub when testing?
> >
> > Another concern is the docker pull ziemin/pio-testing in the Travis
> > config file. There is also a testing/Dockerfile which starts from
> > ubuntu, so I think we should either use docker pull ubuntu, or use a
> > pre-built testing Docker image instead of the Dockerfile.
> >
> > Best,
> > Xusen Yin
> >
> > > On Jul 22, 2016, at 2:52 PM, Marcin Ziemiński <[email protected]> wrote:
> > >
> > > Hi!
> > >
> > > I have a feeling that PredictionIO is lacking integration tests.
> > > TravisCI is executed only on the unit tests residing in the repository.
> > > Better tests are important not only for keeping up the quality of the
> > > project, but also for the sheer comfort of development. Therefore, I
> > > tried to come up with a simple basis for adding and building tests.
> > >
> > > - Integration tests should be agnostic to environment settings (it
> > >   should not matter whether we use Postgres or HBase)
> > > - They should be easy to run for developers, and the configuration
> > >   should not pollute their working space
> > >
> > > I have pushed a sequence of commits to my personal fork and ran Travis
> > > builds on them - Diff with upstream
> > > <https://github.com/apache/incubator-predictionio/compare/develop...Ziemin:testing-infrastructure>
> > >
> > > The following changes were introduced:
> > >
> > > - A dedicated Docker image was prepared. This image fetches and
> > >   prepares some possible dependencies of PredictionIO - Postgres,
> > >   HBase, Spark, Elasticsearch.
> > >   Upon container initialization all services are started, including a
> > >   Spark standalone cluster. The best way to start it is to use the
> > >   testing/run_docker.sh script, which binds relevant ports and mounts
> > >   shared directories with the ivy2 cache and PredictionIO's code
> > >   repository. More importantly, it sets up pio's configuration, e.g.:
> > >   $ /run_docker.sh PGSQL HBASE HDFS ~/projects/incubator-predictionio \
> > >       '/pio_host/testing/simple_scenario/run_scenario.sh'
> > >   This command should set the metadata repo to PGSQL, event data to
> > >   HBASE, and model data to HDFS. The last two arguments are the path to
> > >   the repo and a command to run from inside the container.
> > >   An important thing to note is that the container expects a tar with
> > >   the built distribution to be found in the shared /pio_host directory,
> > >   which is later unpacked.
> > >   Users can then safely execute all pio ... commands. By default the
> > >   container pops up a bash shell if not given any other command.
> > > - Currently there is only one simple test added, which is just a copy
> > >   of the steps mentioned in the quickstart tutorial.
> > > - .travis.yml was modified to run 4 concurrent builds: one for unit
> > >   tests as previously, and three integration tests for various
> > >   combinations of services:
> > >
> > >   env:
> > >     global:
> > >       - PIO_HOME=`pwd`
> > >     matrix:
> > >       - BUILD_TYPE=Unit
> > >       - BUILD_TYPE=Integration METADATA_REP=PGSQL EVENTDATA_REP=PGSQL
> > >         MODELDATA_REP=PGSQL
> > >       - BUILD_TYPE=Integration METADATA_REP=ELASTICSEARCH
> > >         EVENTDATA_REP=HBASE MODELDATA_REP=LOCALFS
> > >       - BUILD_TYPE=Integration METADATA_REP=ELASTICSEARCH
> > >         EVENTDATA_REP=PGSQL MODELDATA_REP=HDFS
> > >
> > >   Here you can find the build logs: travis logs
> > >   <https://travis-ci.org/Ziemin/incubator-predictionio/builds/146753806>
> > >   What is more, to make build times shorter, ivy jars are now cached on
> > >   Travis, so that they are resolved faster in subsequent builds.
> > >
> > > The current setup lets developers run tests against different
> > > environment settings in an easy and deterministic way, as well as use
> > > Travis or other CI tools more conveniently. What is left to do now is
> > > to prepare a sensible set of different tests written in a concise and
> > > extensible way. I think that ideally we could use the Python API and
> > > add a small library to it focused strictly on our testing purposes.
> > >
> > > Any insights would be invaluable.
> > >
> > > Regards,
> > >
> > > -- Marcin
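As a side note on the backend matrix quoted above: each Travis build picks a metadata/event data/model data combination, which ultimately has to reach pio's storage configuration. A minimal sketch of how a small Python testing library might translate such a combination into repository environment variables - the function name `storage_env` is hypothetical, and the `PIO_STORAGE_REPOSITORIES_*_SOURCE` variable naming is an assumption based on pio-env.sh conventions, not a confirmed API:

```python
# Hypothetical helper for the proposed integration-test library.
# Assumption: pio reads PIO_STORAGE_REPOSITORIES_<REPO>_SOURCE-style
# variables, as suggested by pio-env.sh; verify against the real config.

def storage_env(metadata, eventdata, modeldata):
    """Map one matrix entry (e.g. METADATA_REP=PGSQL EVENTDATA_REP=HBASE
    MODELDATA_REP=LOCALFS) to storage environment variables."""
    combo = {"METADATA": metadata, "EVENTDATA": eventdata, "MODELDATA": modeldata}
    return {
        "PIO_STORAGE_REPOSITORIES_{}_SOURCE".format(repo): source
        for repo, source in combo.items()
    }

env = storage_env("ELASTICSEARCH", "HBASE", "LOCALFS")
# env["PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE"] == "HBASE"
```

Centralizing this mapping in one function would keep the Travis matrix, the Docker entry script, and local runs consistent with each other.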
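Marcin suggests building a small Python library focused on testing. As a hedged sketch of what one scenario helper in such a library could look like (everything here - `run_scenario`, `QUICKSTART_STEPS` - is illustrative and not part of PredictionIO; only the pio subcommands themselves come from the quickstart), a runner that executes a command sequence and fails fast on the first error:

```python
import subprocess

# Hypothetical scenario: the quickstart steps as argument lists.
# The app name "MyTestApp" is an illustrative placeholder.
QUICKSTART_STEPS = [
    ["pio", "app", "new", "MyTestApp"],
    ["pio", "build"],
    ["pio", "train"],
]

def run_scenario(steps, runner=subprocess.check_call):
    """Execute each pio command in order; check_call raises
    CalledProcessError on the first non-zero exit status."""
    for cmd in steps:
        runner(cmd)

# Injecting a recording runner allows a dry run without pio installed:
log = []
run_scenario(QUICKSTART_STEPS, runner=log.append)
# log now holds the three command lists in order
```

Making the runner injectable keeps the scenario definitions testable outside the Docker container, while the default `subprocess.check_call` gives the fail-fast behavior an integration test needs inside it.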
