Repository: incubator-impala
Updated Branches:
  refs/heads/master 10385c11a -> 98490925c
IMPALA-6108, IMPALA-6070: Parallel data load (re-instated)

This is a revert of a revert, re-enabling parallel data load. It avoids the
race condition by explicitly configuring the temporary directory in question
in load-data.py.

When the parallel data load change went in, we discovered a race with a
signature of:

  java.io.FileNotFoundException: File
  /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist

The number in this path is milliseconds since the epoch, and the race occurs
when two queries submitted to HiveServer2, running with the local runner, hit
the same millisecond time stamp. The upstream bug is
https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which is now
marked as a dupe).

I've tested this by running data load 5 times on the same machines where it
failed before. I also ran data load manually and inspected the system to make
sure that the temporary directories are getting created as expected in
/tmp/impala-data-load-*.
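The race described above boils down to two processes deriving a working
directory name from the same millisecond clock value, while the fix hands each
data-load invocation a directory created by `tempfile.mkdtemp`. A minimal
Python sketch of both sides (the colliding-path construction below is
illustrative, not Hadoop's actual code; only the `impala-data-load-` prefix
comes from the commit):

```python
import os
import tempfile
import time

# Hazard: two "jobs" started in the same millisecond compute the same
# directory name (hypothetical sketch of Hadoop's scheme, not its code).
millis = int(time.time() * 1000)
dir_a = "/tmp/hadoop-user/mapred/local/%d_tmp" % millis
dir_b = "/tmp/hadoop-user/mapred/local/%d_tmp" % millis
assert dir_a == dir_b  # same millisecond -> same path -> race

# Fix: tempfile.mkdtemp() atomically creates a directory with a name that is
# unique even across concurrent callers, as load-data.py does for
# mapreduce.cluster.local.dir.
unique_a = tempfile.mkdtemp(prefix="impala-data-load-")
unique_b = tempfile.mkdtemp(prefix="impala-data-load-")
assert unique_a != unique_b

# Clean up the sketch's directories.
os.rmdir(unique_a)
os.rmdir(unique_b)
```

Because `mkdtemp` both generates the name and creates the directory in one
step, there is no window in which a second caller can claim the same path.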
Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
Reviewed-on: http://gerrit.cloudera.org:8080/8405
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/76111ce1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/76111ce1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/76111ce1

Branch: refs/heads/master
Commit: 76111ce168c25a7882a13705963dc3c7118121a3
Parents: 10385c1
Author: Philip Zeyliger <[email protected]>
Authored: Wed Oct 25 16:38:22 2017 -0700
Committer: Impala Public Jenkins <[email protected]>
Committed: Thu Nov 2 00:40:19 2017 +0000

----------------------------------------------------------------------
 bin/load-data.py                 | 16 +++++++++++++++-
 testdata/bin/create-load-data.sh | 11 ++++++++---
 testdata/bin/run-hive-server.sh  |  2 +-
 testdata/bin/run-step.sh         | 36 ++++++++++++++++++++++++++++++++++-
 4 files changed, 59 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/76111ce1/bin/load-data.py
----------------------------------------------------------------------
diff --git a/bin/load-data.py b/bin/load-data.py
index 7b2ab23..ede604a 100755
--- a/bin/load-data.py
+++ b/bin/load-data.py
@@ -109,7 +109,21 @@ if options.use_kerberos:
   hive_auth = "principal=" + options.principal
 
 HIVE_ARGS = '-n %s -u "jdbc:hive2://%s/default;%s" --verbose=true'\
-    % (getpass.getuser(), options.hive_hs2_hostport, hive_auth)
+    % (getpass.getuser(), options.hive_hs2_hostport, hive_auth)
+
+# When HiveServer2 is configured to use "local" mode (i.e., MR jobs are run
+# in-process rather than on YARN), Hadoop's LocalDistributedCacheManager has a
+# race, wherein it tries to localize jars into
+# /tmp/hadoop-$USER/mapred/local/<millis>.
+# Two simultaneous Hive queries against HS2 can conflict here. Weirdly,
+# LocalJobRunner handles a similar issue (with the staging directory) by
+# appending a random number. To overcome this, in the case that HS2 is on the
+# local machine (which we conflate with also running MR jobs locally), we
+# move the temporary directory into a unique directory via configuration.
+# This block can be removed when
+# https://issues.apache.org/jira/browse/MAPREDUCE-6441 is resolved.
+if options.hive_hs2_hostport.startswith("localhost:"):
+  HIVE_ARGS += ' --hiveconf "mapreduce.cluster.local.dir=%s"' % (tempfile.mkdtemp(
+      prefix="impala-data-load-"))
 
 HADOOP_CMD = os.path.join(os.environ['HADOOP_HOME'], 'bin/hadoop')


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/76111ce1/testdata/bin/create-load-data.sh
----------------------------------------------------------------------
diff --git a/testdata/bin/create-load-data.sh b/testdata/bin/create-load-data.sh
index 8640ded..c5207a9 100755
--- a/testdata/bin/create-load-data.sh
+++ b/testdata/bin/create-load-data.sh
@@ -449,9 +449,15 @@ fi
 
 if [ $SKIP_METADATA_LOAD -eq 0 ]; then
   run-step "Loading custom schemas" load-custom-schemas.log load-custom-schemas
-  run-step "Loading functional-query data" load-functional-query.log \
+  # Run some steps in parallel, with run-step-backgroundable / run-step-wait-all.
+  # This is effective on steps that take a long time and don't depend on each
+  # other. Functional-query takes about 35 minutes, and TPC-H and TPC-DS can
+  # finish while functional-query is running.
+  run-step-backgroundable "Loading functional-query data" load-functional-query.log \
     load-data "functional-query" "exhaustive"
-  run-step "Loading TPC-H data" load-tpch.log load-data "tpch" "core"
+  run-step-backgroundable "Loading TPC-H data" load-tpch.log load-data "tpch" "core"
+  run-step-backgroundable "Loading TPC-DS data" load-tpcds.log load-data "tpcds" "core"
+  run-step-wait-all
 
   # Load tpch nested data.
  # TODO: Hacky and introduces more complexity into the system, but it is
  # expedient.
  if [[ -n "$CM_HOST" ]]; then
@@ -459,7 +465,6 @@ if [ $SKIP_METADATA_LOAD -eq 0 ]; then
   fi
   run-step "Loading nested data" load-nested.log \
     ${IMPALA_HOME}/testdata/bin/load_nested.py ${LOAD_NESTED_ARGS:-}
-  run-step "Loading TPC-DS data" load-tpcds.log load-data "tpcds" "core"
   run-step "Loading auxiliary workloads" load-aux-workloads.log load-aux-workloads
   run-step "Loading dependent tables" copy-and-load-dependent-tables.log \
     copy-and-load-dependent-tables


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/76111ce1/testdata/bin/run-hive-server.sh
----------------------------------------------------------------------
diff --git a/testdata/bin/run-hive-server.sh b/testdata/bin/run-hive-server.sh
index 530b804..42d95b5 100755
--- a/testdata/bin/run-hive-server.sh
+++ b/testdata/bin/run-hive-server.sh
@@ -72,7 +72,7 @@ ${CLUSTER_BIN}/wait-for-metastore.py --transport=${METASTORE_TRANSPORT}
 if [ ${ONLY_METASTORE} -eq 0 ]; then
   # Starts a HiveServer2 instance on the port specified by the HIVE_SERVER2_THRIFT_PORT
   # environment variable.
-  hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 &
+  HADOOP_HEAPSIZE="512" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 &
 
   # Wait for the HiveServer2 service to come up because callers of this script
   # may rely on it being available.
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/76111ce1/testdata/bin/run-step.sh
----------------------------------------------------------------------
diff --git a/testdata/bin/run-step.sh b/testdata/bin/run-step.sh
index 45c5774..9943013 100755
--- a/testdata/bin/run-step.sh
+++ b/testdata/bin/run-step.sh
@@ -48,5 +48,39 @@ function run-step {
     return 1
   fi
   ELAPSED_TIME=$(($SECONDS - $START_TIME))
-  echo " OK (Took: $(($ELAPSED_TIME/60)) min $(($ELAPSED_TIME%60)) sec)"
+  echo " ${MSG} OK (Took: $(($ELAPSED_TIME/60)) min $(($ELAPSED_TIME%60)) sec)"
+}
+
+# Arrays to manage background tasks.
+declare -a RUN_STEP_PIDS
+declare -a RUN_STEP_MSGS
+
+# Runs the given step in the background. Many tasks may be started in the
+# background, and all of them must be joined together with run-step-wait-all.
+# No dependency management or maximum on the number of tasks is provided.
+function run-step-backgroundable {
+  MSG="$1"
+  run-step "$@" &
+  local pid=$!
+  echo "Started ${MSG} in background; pid $pid."
+  RUN_STEP_PIDS+=($pid)
+  RUN_STEP_MSGS+=("${MSG}")
+}
+
+# Wait for all tasks that were run with run-step-backgroundable.
+# Fails if any of the background tasks has failed. Clears $RUN_STEP_PIDS.
+function run-step-wait-all {
+  local ret=0
+  for idx in "${!RUN_STEP_PIDS[@]}"; do
+    pid="${RUN_STEP_PIDS[$idx]}"
+    msg="${RUN_STEP_MSGS[$idx]}"
+
+    if ! wait $pid; then
+      ret=1
+      echo "Background task $msg (pid $pid) failed."
+    fi
+  done
+  RUN_STEP_PIDS=()
+  RUN_STEP_MSGS=()
+  return $ret
+}
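The run-step-backgroundable / run-step-wait-all pair above is a simple
fork/join over shell jobs: launch independent steps concurrently, then join
them all and report failure if any step failed. The same pattern can be
sketched in Python with subprocess (the step names and commands below are
placeholders, not the real load steps):

```python
import subprocess

# Analogue of run-step-backgroundable: start each step as a background
# process. Dicts preserve insertion order, mirroring the shell arrays.
steps = {
    "step-ok": ["true"],     # placeholder for a step that succeeds
    "step-fail": ["false"],  # placeholder for a step that fails
}
procs = {name: subprocess.Popen(cmd) for name, cmd in steps.items()}

# Analogue of run-step-wait-all: join every background step, collecting
# the names of any that exited non-zero.
failed = []
for name, proc in procs.items():
    if proc.wait() != 0:
        failed.append(name)

# Overall success only if no background step failed.
overall_ok = not failed
```

As in the shell version, every step is joined even after one has failed, so a
single bad step does not hide the status of the others.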
