IMPALA-4340: explain how to install postgresql-9.5 or higher The random query generator needs to compare against PostgresQL 9.5 or higher to take advantage of some of the more recent features, especially as it pertains to Impala/Kudu INSERT and UPSERT queries. Developers will need assistance setting up their development environments if they need to use the random query generator.
This patch provides instructions on how to do so. We provide instructions, not automation, since this will have side-effects on developers' workstations: we can't presume to know how a developer might want to install or configure postgres, and we haven't tested on anything except our own development environment. Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9 Reviewed-on: http://gerrit.cloudera.org:8080/4846 Reviewed-by: Alex Behm <[email protected]> Tested-by: Tim Armstrong <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/7fc31b53 Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/7fc31b53 Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/7fc31b53 Branch: refs/heads/master Commit: 7fc31b534d4c5cb118c559e16556a6c1ae6ca7fc Parents: 5107669 Author: Michael Brown <[email protected]> Authored: Tue Oct 25 12:36:31 2016 -0700 Committer: Tim Armstrong <[email protected]> Committed: Mon Oct 31 23:18:55 2016 +0000 ---------------------------------------------------------------------- tests/comparison/POSTGRES.txt | 84 ++++++++++++++++++++++++++++++++++++++ tests/comparison/README | 4 ++ 2 files changed, 88 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/7fc31b53/tests/comparison/POSTGRES.txt ---------------------------------------------------------------------- diff --git a/tests/comparison/POSTGRES.txt b/tests/comparison/POSTGRES.txt new file mode 100644 index 0000000..af28240 --- /dev/null +++ b/tests/comparison/POSTGRES.txt @@ -0,0 +1,84 @@ +Motivation +---------- +To test Impala/Kudu: + 1. INSERT with ignorable primary key violations + 2. "UPSERT" + +we need PostgresQL 9.5 or higher. + +https://www.postgresql.org/docs/9.5/static/sql-insert.html + +PostgresQL 9.5+ Installation +---------------------------- + +Common GNU/Linux distributions won't necessarily have PostgresQL as high +as 9.5 available. Fortunately, PostgresQL understands this and provides +instructions for adding sources for both Debian- and Redhat-based +packages. + +https://wiki.postgresql.org/wiki/Apt +https://yum.postgresql.org/repopackages.php + +What follows are instructions for getting this working on Ubuntu 14.04. + +Use the instructions at https://wiki.postgresql.org/wiki/Apt + +Note: apt pinning is possible, but if you already installed postgres +from your standard GNU/Linux sources, you'll be forced to upgrade some +packages like postgresql-common anyway. Resolving those problems is out +of the scope of this document. + +Note: If you have postgresql-9.3 installed, your 9.3 installation will +get updated to the latest point release version. + +If you then install a new PostgresQL version, you might be prompted what +to do about /etc/postgresql-common/createcluster.conf . I elected to +keep the local version installed. + +Before you start your new version of PostgresQL, you will probably want +to make it easier to connect to the server: + + $ sudo perl -pi -e 's/(?:peer|md5)/trust/g' /etc/postgresql/9.5/main/pg_hba.conf + +Note: By default, PostgresQL servers only bind to localhost, so the +command above should only affect the localhost TCP server. + +Finally, you can 'sudo service postgresql start'. PostgresQL 9.3 is +still listening on port 5432, but note 9.5 is also running, on 5433. + +PostgresQL 9.5+ Usage +--------------------- + +Databases aren't shared across versions. You'll need to load data, e.g. +via: + + $ tests/comparison/data_generator.py \ + --use-postgresql \ + --postgresql-port 5433 # <---- important \ + [other options] + +... and use the random query generator in the same way: + + $ tests/comparison/discrepancy_searcher.py \ + --use-postgresql \ + --postgresql-port 5433 # <---- important \ + [other options] + +Testing +------- + +The following steps have been taking to validate postgresql-9.5 +9.5.4-1.pgdg14.04+2 on Ubuntu 14.04.2 LTS: + +1. Migrating tpch and tpch_kudu from Impala into PostgresQL 9.5 (via + 'data_generator.py migrate') +2. Loading random data into PostgresQL 9.5 (via 'data_generator.py populate') +3. Discrepancy tests between Impala and PostgresQL 9.5, and then again + on 9.3, to ensure results are the same. +4. Quick experiments to make sure ON CONFLICT DO NOTHING and ON CONFLICT + DO UPDATE actually work when connecting to Postgres on port 5433. + +Other Resources +--------------- +https://wiki.postgresql.org/wiki/Detailed_installation_guides +https://wiki.postgresql.org/wiki/Apt/FAQ http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/7fc31b53/tests/comparison/README ---------------------------------------------------------------------- diff --git a/tests/comparison/README b/tests/comparison/README index 579a8cf..d3e86df 100644 --- a/tests/comparison/README +++ b/tests/comparison/README @@ -22,6 +22,10 @@ Requirements: sudo apt-get install python-mysqldb sudo apt-get install python-psycopg2 # Postgresql +For Impala/Kudu CRUD random query generation and comparison, please see +the supplemental POSTGRES.txt on setting up PostgresQL 9.5 or higher as +a reference database. + Please see the supplemental ORACLE.txt on setting up Oracle as a reference database.
