Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/23092 )


Change subject: IMPALA-14697: Generate TPC-H/TPC-DS data in parallel
......................................................................

IMPALA-14697: Generate TPC-H/TPC-DS data in parallel

For performance test jobs, generating the datasets at higher
scales can take significant time. Currently, data generation
runs a single invocation of the generator binary in a single
thread. This change parallelizes generation by producing each
table in a separate thread. There are a limited number of
tables and many of the tables are small, so this change does
not add a control for the level of parallelism.
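
As a rough illustration of the one-thread-per-table approach described
above, here is a minimal Python sketch. It is not the code in this
change: the function name, the build_command callable, and the
placeholder generator command are all hypothetical, and the real
preload scripts may be structured quite differently.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def generate_tables_in_parallel(tables, build_command, run=subprocess.run):
    """Run one generator invocation per table, each in its own thread.

    tables: list of table names to generate.
    build_command: hypothetical callable mapping a table name to an argv
        list for the generator binary (the actual flags are not shown here).
    run: callable that executes the command; defaults to subprocess.run.
    """
    def generate(table):
        # check=True makes subprocess.run raise CalledProcessError when the
        # generator exits with a non-zero status.
        run(build_command(table), check=True)
        return table

    # One worker per table, so the table count is the only bound on
    # parallelism. max(1, ...) avoids a ValueError for an empty table list.
    with ThreadPoolExecutor(max_workers=max(1, len(tables))) as pool:
        # list() waits for every task and re-raises the first failure.
        return list(pool.map(generate, tables))
```

A caller would supply its own command builder, for example
generate_tables_in_parallel(["lineitem", "orders"], my_builder), where
my_builder is an assumed helper that returns the generator command line
for a single table. Because each thread mostly waits on a child
process, threads are sufficient here and no process pool is needed.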

Testing:
 - Ran core job
 - Loaded TPC-H scale 42
 - Loaded TPC-DS scale 20

Change-Id: I7ba13e2275be2ac1a5ae8f9354c947d9f1adf263
---
M testdata/datasets/tpcds/preload
M testdata/datasets/tpch/preload
2 files changed, 118 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/23092/3
--
To view, visit http://gerrit.cloudera.org:8080/23092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7ba13e2275be2ac1a5ae8f9354c947d9f1adf263
Gerrit-Change-Number: 23092
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <[email protected]>
