IMPALA-6898: Avoid duplicate Kudu load during full dataload

testdata/bin/create-load-data.sh does bin/load-data.py for
functional/exhaustive, tpch/core, and tpcds/core in a first phase,
then it loads functional and tpch for Kudu in a second phase. For a
full dataload, this second phase is not necessary.
functional/exhaustive and tpch/core already include Kudu.
This avoids the second phase when doing a full dataload. The second
phase is still necessary when loading from a snapshot, and this does
not change that behavior. This saves a couple minutes off of full
dataload.

Change-Id: Ic023d230f99126ed37795106c38faae5f0cb608e
Reviewed-on: http://gerrit.cloudera.org:8080/10128
Reviewed-by: Philip Zeyliger <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>

Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/b6a553ef
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/b6a553ef
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/b6a553ef

Branch: refs/heads/2.x
Commit: b6a553efa3afe177a68823331585705ee1ee1d17
Parents: d42f8d7
Author: Joe McDonnell <[email protected]>
Authored: Thu Apr 19 16:14:03 2018 -0700
Committer: Impala Public Jenkins <[email protected]>
Committed: Sat Apr 21 01:10:17 2018 +0000

----------------------------------------------------------------------
 testdata/bin/create-load-data.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/impala/blob/b6a553ef/testdata/bin/create-load-data.sh
----------------------------------------------------------------------
diff --git a/testdata/bin/create-load-data.sh b/testdata/bin/create-load-data.sh
index e50515b..51ba449 100755
--- a/testdata/bin/create-load-data.sh
+++ b/testdata/bin/create-load-data.sh
@@ -540,8 +540,10 @@ elif [ "${TARGET_FILESYSTEM}" = "hdfs" ]; then
   load-data "functional-query" "core" "hbase/none"
 fi
 
-if $KUDU_IS_SUPPORTED; then
+if [[ $SKIP_METADATA_LOAD -eq 1 && $KUDU_IS_SUPPORTED ]]; then
   # Tests depend on the kudu data being clean, so load the data from scratch.
+  # This is only necessary if this is not a full dataload, because a full dataload
+  # already loads Kudu functional and TPC-H tables from scratch.
   run-step-backgroundable "Loading Kudu functional" load-kudu.log \
         load-data "functional-query" "core" "kudu/none/none" force
   run-step-backgroundable "Loading Kudu TPCH" load-kudu-tpch.log \
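
A note on the new condition, with a small sketch. The removed line ran
$KUDU_IS_SUPPORTED as a command, which suggests (an assumption here, not
confirmed by this hunk) that the variable holds the string "true" or
"false". Inside [[ ]], a bare word is a non-empty-string test rather than
a command, so the string "false" still evaluates as true. The function
names below are hypothetical and are not part of create-load-data.sh;
they only contrast the condition as written with one possible alternative.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not from create-load-data.sh.
# Assumption: KUDU_IS_SUPPORTED is the string "true" or "false".

# Mirrors the condition in the diff. Inside [[ ]], $KUDU_IS_SUPPORTED is
# tested for being non-empty, so "false" counts as true.
cond_as_written() {
  local SKIP_METADATA_LOAD=$1
  local KUDU_IS_SUPPORTED=$2
  if [[ $SKIP_METADATA_LOAD -eq 1 && $KUDU_IS_SUPPORTED ]]; then
    echo run
  else
    echo skip
  fi
}

# One possible alternative: keep the numeric test in [[ ]] and execute the
# boolean as a command (the true/false builtins), as the removed line did.
cond_explicit() {
  local SKIP_METADATA_LOAD=$1
  local KUDU_IS_SUPPORTED=$2
  if [[ $SKIP_METADATA_LOAD -eq 1 ]] && $KUDU_IS_SUPPORTED; then
    echo run
  else
    echo skip
  fi
}

cond_as_written 1 false   # prints "run"
cond_explicit 1 false     # prints "skip"
cond_explicit 1 true      # prints "run"
```

Both forms skip the second phase when SKIP_METADATA_LOAD is 0. They
differ only when Kudu is unsupported and the load is from a snapshot.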
