Repository: incubator-impala
Updated Branches:
  refs/heads/master d3a6b0bf2 -> 00e3a55cb
IMPALA-5768: Better developer documentation

Guide to important environment variables for build, impala paths
and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Reviewed-on: http://gerrit.cloudera.org:8080/7350
Tested-by: Impala Public Jenkins
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>

Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/00e3a55c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/00e3a55c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/00e3a55c

Branch: refs/heads/master
Commit: 00e3a55cb8091b6d410fb2f40691150612900629
Parents: d3a6b0b
Author: Zach Amsden <zams...@cloudera.com>
Authored: Sat Jul 1 03:45:21 2017 +0000
Committer: Tim Armstrong <tarmstr...@cloudera.com>
Committed: Thu Aug 10 06:12:08 2017 +0000

----------------------------------------------------------------------
 README.md            | 73 +++++++++++++++++++++++++++++++++++++++++++++--
 bin/impala-config.sh | 25 +++-------------
 2 files changed, 74 insertions(+), 24 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/00e3a55c/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 797e7c5..64e74fb 100644
--- a/README.md
+++ b/README.md
@@ -29,11 +29,78 @@ Impala's internals and architecture, visit the
 
 Impala only supports Linux at the moment.
 
+## Export Control Notice
+
+This distribution uses cryptographic software and may be subject to export controls.
+Please refer to EXPORT\_CONTROL.md for more information.
+
 ## Build Instructions
 
 See bin/bootstrap_build.sh.
 
-## Export Control Notice
+### Detailed Build Notes
 
-This distribution uses cryptographic software and may be subject to export controls.
-Please refer to EXPORT\_CONTROL.md for more information.
+Impala can be built with pre-built components, downloaded from S3, or can be
+built with an in-place toolchain located in the thirdparty directory (not recommended).
+The components needed to build Impala are Apache Hadoop, Hive, HBase, and Sentry.
+If you need to manually override the locations or versions of these components, you
+can do so through the environment variables and scripts listed below.
+
+##### Scripts and directories
+
+| Location | Purpose |
+|------------------------------|---------|
+| bin/impala-config.sh | This script must be sourced to setup all environment variables properly to allow other scripts to work |
+| bin/impala-config-local.sh | A script can be created in this location to set local overrides for any environment variables |
+| bin/impala-config-branch.sh | A version of the above that can be checked into a branch for convenience. |
+| bin/bootstrap_build.sh | A helper script to bootstrap some of the build requirements. |
+| bin/bootstrap_development.sh | A helper script to bootstrap a developer environment. Please read it before using. |
+| be/build/ | Impala build output goes here. |
+| be/generated-sources/ | Thrift and other generated source will be found here. |
+
+##### Build Related Variables
+
+| Environment variable | Default value | Description |
+|----------------------|---------------|-------------|
+| IMPALA_HOME | | Top level Impala directory |
+| IMPALA_TOOLCHAIN | "${IMPALA_HOME}/toolchain" | Native toolchain directory (for compilers, libraries, etc.) |
+| SKIP_TOOLCHAIN_BOOTSTRAP | "false" | Skips downloading the toolchain any python dependencies if "true" |
+| CDH_COMPONENTS_HOME | "${IMPALA_HOME}/toolchain/cdh_components" OR "${IMPALA_HOME}/thirdparty" (if detected) | If a thirdparty directory is present, components found here will override anything in IMPALA_TOOLCHAIN. |
+| CDH_MAJOR_VERSION | "5" | Identifier used to uniqueify paths for potentially incompatible component builds. |
+| IMPALA_CONFIG_SOURCED | "1" | Set by ${IMPALA_HOME}/bin/impala-config.sh (internal use) |
+| JAVA_HOME | "/usr/lib/jvm/${JAVA_VERSION}" | Used to locate Java |
+| JAVA_VERSION | "java-7-oracle-amd64" | Can override to set a local Java version. |
+| JAVA | "${JAVA_HOME}/bin/java" | Java binary location. |
+| CLASSPATH | | See bin/set-classpath.sh for details. |
+| PYTHONPATH | Will be changed to include: "${IMPALA_HOME}/shell/gen-py" "${IMPALA_HOME}/testdata" "${THRIFT_HOME}/python/lib/python2.7/site-packages" "${HIVE_HOME}/lib/py" "${IMPALA_HOME}/shell/ext-py/prettytable-0.7.1/dist/prettytable-0.7.1" "${IMPALA_HOME}/shell/ext-py/sasl-0.1.1/dist/sasl-0.1.1-py2.7-linux-x "${IMPALA_HOME}/shell/ext-py/sqlparse-0.1.14/dist/sqlparse-0.1.14-py2 |
+
+##### Source Directories for Impala
+
+| Environment variable | Default value | Description |
+|----------------------|---------------|-------------|
+| IMPALA_BE_DIR | "${IMPALA_HOME}/be" | Backend directory. Build output is also stored here. |
+| IMPALA_FE_DIR | "${IMPALA_HOME}/fe" | Frontend directory |
+| IMPALA_COMMON_DIR | "${IMPALA_HOME}/common" | Common code (thrift, function registry) |
+
+##### Various Compilation Settings
+
+| Environment variable | Default value | Description |
+|----------------------|---------------|-------------|
+| IMPALA_BUILD_THREADS | "8" or set to number of processors by default. | Used for make -j and distcc -j settings. |
+| IMPALA_MAKE_FLAGS | "" | Any extra settings to pass to make. Also used when copying udfs / udas into HDFS. |
+| USE_SYSTEM_GCC | "0" | If set to any other value, directs cmake to not set GCC_ROOT, CMAKE_C_COMPILER, CMAKE_CXX_COMPILER, as well as setting TOOLCHAIN_LINK_FLAGS |
+| IMPALA_CXX_COMPILER | "default" | Used by cmake (cmake_modules/toolchain and clang_toolchain.cmake) to select gcc / clang |
+| USE_GOLD_LINKER | "true" | Directs backend cmake to use gold. |
+| IS_OSX | "false" | (Experimental) currently only used to disable Kudu. |
+
+##### Dependencies
+| Environment variable | Default value | Description |
+|----------------------|---------------|-------------|
+| HADOOP_HOME | "${CDH_COMPONENTS_HOME}/hadoop-${IMPALA_HADOOP_VERSION}/" | Used to locate Hadoop |
+| HADOOP_INCLUDE_DIR | "${HADOOP_HOME}/include" | For 'hdfs.h' |
+| HADOOP_LIB_DIR | "${HADOOP_HOME}/lib" | For 'libhdfs.a' or 'libhdfs.so' |
+| HIVE_HOME | "${CDH_COMPONENTS_HOME}/{hive-${IMPALA_HIVE_VERSION}/" | |
+| HIVE_SRC_DIR | "${HIVE_HOME}/src" | Used to find Hive thrift files. |
+| HBASE_HOME | "${CDH_COMPONENTS_HOME}/hbase-${IMPALA_HBASE_VERSION}/" | |
+| SENTRY_HOME | "${CDH_COMPONENTS_HOME}/sentry-${IMPALA_SENTRY_VERSION}/" | Used to setup test data |
+| THRIFT_HOME | "${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT_VERSION}" | |

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/00e3a55c/bin/impala-config.sh
----------------------------------------------------------------------
diff --git a/bin/impala-config.sh b/bin/impala-config.sh
index 79adcb7..5470bb0 100755
--- a/bin/impala-config.sh
+++ b/bin/impala-config.sh
@@ -334,17 +334,12 @@ export IMPALA_CUSTOM_CLUSTER_TEST_LOGS_DIR="${IMPALA_LOGS_DIR}/custom_cluster_te
 # List of all Impala log dirs so they can be created by buildall.sh
 export IMPALA_ALL_LOGS_DIRS="${IMPALA_CLUSTER_LOGS_DIR}
   ${IMPALA_DATA_LOADING_LOGS_DIR} ${IMPALA_DATA_LOADING_SQL_DIR}
-  ${IMPALA_EE_TEST_LOGS_DIR} ${IMPALA_FE_TEST_COVERAGE_DIR}
+  ${IMPALA_FE_TEST_LOGS_DIR} ${IMPALA_FE_TEST_COVERAGE_DIR}
   ${IMPALA_BE_TEST_LOGS_DIR} ${IMPALA_EE_TEST_LOGS_DIR}
   ${IMPALA_CUSTOM_CLUSTER_TEST_LOGS_DIR}"
 
 # Reduce the concurrency for local tests to half the number of cores in the system.
-# Note than nproc may not be available on older distributions (centos5.5)
-if type nproc >/dev/null 2>&1; then
-  CORES=$(($(nproc) / 2))
-else
-  CORES=4
-fi
+CORES=$(($(getconf _NPROCESSORS_ONLN) / 2))
 export NUM_CONCURRENT_TESTS="${NUM_CONCURRENT_TESTS-${CORES}}"
 
 export KUDU_MASTER_HOSTS="${KUDU_MASTER_HOSTS:-127.0.0.1}"
@@ -448,31 +443,23 @@ export USER="${USER-`id -un`}"
 # Enable if you suspect a JNI issue
 # TODO: figure out how to turn this off only the stuff that can't run with it.
 #LIBHDFS_OPTS="-Xcheck:jni -Xcheck:nabounds"
-# - Points to the location of libbackend.so.
 export LIBHDFS_OPTS="${LIBHDFS_OPTS:-} -Djava.library.path=${HADOOP_LIB_DIR}/native/"
-# READER BEWARE: This always points to the debug build.
-# TODO: Consider having cmake scripts change this value depending on
-# the build type.
-LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
+
 # IMPALA-5080: Our use of PermGen space sometimes exceeds the default maximum while
 # running tests that load UDF jars.
 LIBHDFS_OPTS="${LIBHDFS_OPTS} -XX:MaxPermSize=128m"
 
-export ARTISTIC_STYLE_OPTIONS="$IMPALA_BE_DIR/.astylerc"
-
 export IMPALA_SNAPPY_PATH="${IMPALA_TOOLCHAIN}/snappy-${IMPALA_SNAPPY_VERSION}/lib"
 
 export JAVA_LIBRARY_PATH="${IMPALA_SNAPPY_PATH}"
 
-# So that the frontend tests and PlanService can pick up libbackend.so
-# and other required libraries
+# So that the frontend tests and PlanService can pick up required libraries
 LIB_JAVA=`find "${JAVA_HOME}/" -name libjava.so | head -1`
 LIB_JSIG=`find "${JAVA_HOME}/" -name libjsig.so | head -1`
 LIB_JVM=` find "${JAVA_HOME}/" -name libjvm.so | head -1`
 export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}:`dirname ${LIB_JAVA}`:`dirname ${LIB_JSIG}`"
 LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:`dirname ${LIB_JVM}`"
 LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${HADOOP_LIB_DIR}/native"
-LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_HOME}/be/build/debug/service"
 LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_SNAPPY_PATH}"
 LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_LZO}/build"
 
@@ -488,10 +475,6 @@ CLASSPATH="$IMPALA_FE_DIR/target/classes:$CLASSPATH"
 CLASSPATH="$IMPALA_FE_DIR/src/test/resources:$CLASSPATH"
 CLASSPATH="$LZO_JAR_PATH:$CLASSPATH"
 
-# Setup aliases
-# Helper alias to script that verifies and merges Gerrit changes
-alias gerrit-verify-only="${IMPALA_AUX_TEST_HOME}/jenkins/gerrit-verify-only.sh"
-
 # A marker in the environment to prove that we really did source this file
 export IMPALA_CONFIG_SOURCED=1
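
Note on the NUM_CONCURRENT_TESTS change in bin/impala-config.sh above: the following is a minimal shell sketch of the same pattern, deriving a default from the online processor count and letting a value already set by the caller take precedence. The helper name half_cores, the fallback of 2 when getconf is unavailable, and the floor of 1 are assumptions made for illustration and are not part of this commit. The committed line divides by 2 directly, which evaluates to 0 on a single-core machine.

```shell
# Hypothetical helper, not part of impala-config.sh: prints half the number
# of online processors, but never less than 1.
half_cores() {
  local n half
  # Assumed fallback: treat the machine as having 2 processors if getconf fails.
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)
  half=$(( n / 2 ))
  if [ "${half}" -lt 1 ]; then
    half=1
  fi
  echo "${half}"
}

CORES=$(half_cores)

# ${VAR-default} only substitutes when VAR is unset, so a value exported by the
# caller (even an empty one) is kept. ${VAR:-default} would also replace an
# empty value, which is the form used for KUDU_MASTER_HOSTS in the diff.
export NUM_CONCURRENT_TESTS="${NUM_CONCURRENT_TESTS-${CORES}}"
echo "NUM_CONCURRENT_TESTS=${NUM_CONCURRENT_TESTS}"
```

With this sketch, running `NUM_CONCURRENT_TESTS=3` before sourcing keeps the value 3, while leaving the variable unset picks up the computed default.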