Impala Public Jenkins has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/10167 )

Change subject: IMPALA-6899: Optimize the HDFS commands used in dataload
......................................................................

IMPALA-6899: Optimize the HDFS commands used in dataload

HDFS command-line calls can be expensive due to JVM
startup and other costs. Since most HDFS commands
accept multiple paths, one way to reduce execution
time is to consolidate multiple HDFS commands into a
single HDFS call. Since HDFS put commands follow
symbolic links and can copy recursively, further
consolidation is possible by creating the full
directory structure locally and copying it in a
single HDFS call.
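
The consolidation described above can be sketched roughly as
follows. This is an illustration only, not the code from the
change: the function names, the HDFS_CMD override, and the
paths in the usage comment are hypothetical. It relies only on
`hdfs dfs -mkdir -p` and `hdfs dfs -put -f`, both of which
accept multiple paths.

```shell
#!/usr/bin/env bash
# HDFS_CMD is a hypothetical override (not part of the change). Setting it to
# "echo hdfs" prints the commands instead of running them, for a dry run.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# Create several directories with one JVM launch instead of one per path.
hdfs_mkdirs() {
  ${HDFS_CMD} dfs -mkdir -p "$@"
}

# Upload several local files or directories to one destination in one call.
# Usage: hdfs_put_all <hdfs-dest> <local-src>...
hdfs_put_all() {
  local dest="$1"
  shift
  ${HDFS_CMD} dfs -put -f "$@" "${dest}"
}

# Stage files under one local directory using symlinks, so that a single
# recursive put of that directory uploads everything.
# Usage: stage_with_symlinks <stage-dir> <local-file>...
stage_with_symlinks() {
  local stage_dir="$1"
  shift
  mkdir -p "${stage_dir}"
  local f
  for f in "$@"; do
    ln -sf "$(readlink -f "${f}")" "${stage_dir}/$(basename "${f}")"
  done
}

# Example (hypothetical paths):
#   stage_with_symlinks /tmp/udf-stage/udfs build/libA.so build/libB.jar
#   hdfs_put_all /test-warehouse /tmp/udf-stage/udfs
```

Compared with a loop that runs one put per file, this pays the
JVM startup cost once per batch rather than once per file.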

This change applies several of these optimizations
throughout the dataload codepath, saving roughly 15
to 40 seconds per step (before -> after):
  Loading Hive Builtins: 1:10 -> 0:30
  Loading custom schemas: 0:35 -> 0:20
  Loading Hive UDFs: 0:45 -> 0:25

Conflicts:
testdata/bin/copy-udfs-udas.sh - conflict due to
"Loosen hive-exec.jar glob pattern..."

Change-Id: I0934353329dc7312394fc4457ab8db2a272c6282
Reviewed-on: http://gerrit.cloudera.org:8080/10120
Reviewed-by: Philip Zeyliger <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
(cherry picked from commit da363a99a4b1afff91600c71650e26932be9350a)
Reviewed-on: http://gerrit.cloudera.org:8080/10167
Reviewed-by: Joe McDonnell <[email protected]>
---
M testdata/bin/copy-udfs-udas.sh
M testdata/bin/create-load-data.sh
M testdata/bin/load-hive-builtins.sh
3 files changed, 131 insertions(+), 122 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I0934353329dc7312394fc4457ab8db2a272c6282
Gerrit-Change-Number: 10167
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
