Philip Zeyliger has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8822 )


Change subject: IMPALA-6070: Parallelize another bit of data load.
......................................................................

IMPALA-6070: Parallelize another bit of data load.

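The two Kudu loads and the Hive UDF load can all run in parallel. This
should shave about 4 minutes off the data load: run serially they take
about 8 minutes, while in parallel the slowest step (4 minutes) sets
the total. (Current timings are 3.5, 4, and 0.6 minutes; see below.)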
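
As a rough illustration of the idea (not the actual change to
testdata/bin/create-load-data.sh), independent steps can be started in
the background and then waited on one by one. The three function names
and their sleep bodies below are hypothetical stand-ins for the real
load steps.

```shell
#!/bin/sh
# Hypothetical stand-ins for the three independent load steps; the real
# commands in testdata/bin/create-load-data.sh are not reproduced here.
load_kudu_functional() { sleep 1; }
load_kudu_tpch()       { sleep 1; }
load_hive_udfs()       { sleep 1; }

# Launch every step in the background and record its PID.
pids=""
for step in load_kudu_functional load_kudu_tpch load_hive_udfs; do
  "$step" &
  pids="$pids $!"
done

# Wait on each PID separately so that a failure in any step is detected.
failed=0
for pid in $pids; do
  wait "$pid" || failed=1
done

echo "steps failed: $failed"
```

With this pattern the wall-clock time is roughly that of the slowest
step rather than the sum of all three, which is where the estimated
saving of about 4 minutes comes from.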

I've run dataload with this change many times.

   Loading Kudu functional (logging to /home/ubuntu/Impala/logs/data_loading/load-kudu.log)...
     Loading workload 'functional-query' using exploration strategy 'core' in table formats 'kudu/none/none' OK (Took: 3 min 29 sec)
   Loading Kudu TPCH (logging to /home/ubuntu/Impala/logs/data_loading/load-kudu-tpch.log)...
     Loading workload 'tpch' using exploration strategy 'core' in table formats 'kudu/none/none' OK (Took: 4 min 0 sec)
   Loading Hive UDFs (logging to /home/ubuntu/Impala/logs/data_loading/build-and-copy-hive-udfs.log)...
     Loading Hive UDFs OK (Took: 0 min 41 sec)

Change-Id: I7e93ee5a77ec9271b980b88bef7ad512ecbe0407
---
M testdata/bin/create-load-data.sh
1 file changed, 4 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8822/1
--
To view, visit http://gerrit.cloudera.org:8080/8822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e93ee5a77ec9271b980b88bef7ad512ecbe0407
Gerrit-Change-Number: 8822
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
