Hey guys,

I put up a small project on GitHub [1] with Hive metastore dumps from tpcds10tb/tpcds30tb (with partitioning) and some scripts to quickly spin up a dockerized Postgres with those dumps loaded.

Personally, I find it useful to check the plans of TPC-DS queries using the usual qtest mechanism (without external tools or tapping into a real cluster) while having beefy stats and partitioning info at hand. The driver and other changes needed to run these tests are located in [2]. I am sharing it here in case it is of use to somebody else.

The two main commands that you will need if you want to try this out:

  docker build --tag postgres-tpcds-metastore:1.0 .
  mvn test -Dtest=TestTezPerfDBCliDriver -Dtest.output.overwrite=true -Dtest.metastore.db=postgres.tpcds

Small caveat: currently in [2] the dockerized Postgres is restarted for every query, which makes things slow. This will be fixed later on.

Best,
Stamatis

[1] https://github.com/zabetak/hive-postgres-metastore
[2] https://github.com/zabetak/hive/tree/qtest_postgres_driver