[
https://issues.apache.org/jira/browse/HDFS-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045694#comment-13045694
]
Eric Yang commented on HDFS-2045:
---------------------------------
HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, and HADOOP_MAPRED_HOME were the result
of splitting the source code into three different submodules. While this works
fine for developers who want to isolate each project, it makes configuration
difficult for production use. HDFS and MAPRED each run as their own uid. The
amount of configuration just multiplies.
To solve this problem, there are several options:
Option 1. Package all common shell scripts in the common jar file. When the
binary tarball is built, the common shell scripts are rearranged and merged
into the binary tarball distribution, and the HADOOP_*_HOME environment
variables are removed completely. $HADOOP_PREFIX is the only hint (derived
from the shell script path, so there is no need to define it in the
environment) that tells all hadoop programs exactly where the bits are laid
out. When HDFS or MAPREDUCE is deployed, there is no need to deploy the COMMON
tarball. To make this work for developers, *-config.sh should be moved to
$HADOOP_PREFIX/libexec. During the build process, hadoop-common-*.jar is
extracted to obtain the common shell scripts. The developer and binary layouts
end up closer to each other. (When the project is converted to maven, this
keeps hdfs/mapreduce loosely coupled and reduces duplicated shell scripts.)
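As a rough illustration of the "generated from shell script path" idea in
Option 1, here is a minimal sketch. The helper name resolve_prefix and the
HADOOP_LIBEXEC_DIR variable are hypothetical and used only for illustration;
the sketch assumes the calling script lives in $HADOOP_PREFIX/bin.

```shell
#!/bin/sh
# Sketch only: derive the install prefix from a script's own location.
# resolve_prefix is a hypothetical helper, not an existing Hadoop function.
resolve_prefix() {
  # $1: path to a script that is assumed to live in <prefix>/bin
  script_dir=$(cd "$(dirname "$1")" && pwd)
  # The prefix is taken to be the parent directory of bin/
  (cd "$script_dir/.." && pwd)
}

# Compute the prefix from the running script instead of the environment.
HADOOP_PREFIX=$(resolve_prefix "$0")
# Hypothetical variable: where the shared *-config.sh files would be found.
HADOOP_LIBEXEC_DIR="$HADOOP_PREFIX/libexec"
```

With this approach a launcher script would source
"$HADOOP_LIBEXEC_DIR/hadoop-config.sh" without any HADOOP_*_HOME variable
being defined by the user.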
Option 2. Preserve HADOOP_*_HOME for source code execution. The
environment-driven layout does not work on the binary tarball. Change the
tarball prefix from hadoop-[common|mapred|hdfs]-0.23.0-SNAPSHOT to
hadoop-[version] for easy extraction.
Option 3. Enable HADOOP_*_HOME for the binary tarball. (This risks crashing
the system if the environment variables are set up badly.)
Option 4. Merge hdfs/mapreduce back into the same project, but keep them as
subdirectories to reduce duplicated shell scripts.
I am inclined to vote for option 2.
> HADOOP_*_HOME environment variables no longer work for tar ball distributions
> -----------------------------------------------------------------------------
>
> Key: HDFS-2045
> URL: https://issues.apache.org/jira/browse/HDFS-2045
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Aaron T. Myers
>
> It used to be that you could do the following:
> # Run `ant bin-package' in your hadoop-common checkout.
> # Set HADOOP_COMMON_HOME to the built directory of hadoop-common.
> # Run `ant bin-package' in your hadoop-hdfs checkout.
> # Set HADOOP_HDFS_HOME to the built directory of hadoop-hdfs.
> # Set PATH to have HADOOP_HDFS_HOME/bin and HADOOP_COMMON_HOME/bin on it.
> # Run `hdfs'.
> \\
> \\
> As of HDFS-1963, this no longer works since hdfs-config.sh is looking in
> HADOOP_COMMON_HOME/bin/ for hadoop-config.sh, but it's being placed in
> HADOOP_COMMON_HOME/libexec.
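
To make the path mismatch described in the issue concrete, here is a minimal
sketch of a lookup that checks libexec first and then falls back to bin. The
helper name find_hadoop_config is hypothetical, and this is not the fix
adopted for HDFS-2045; it only illustrates the two locations involved.

```shell
#!/bin/sh
# Sketch only: locate hadoop-config.sh under a given HADOOP_COMMON_HOME.
# find_hadoop_config is a hypothetical helper, not an existing Hadoop function.
find_hadoop_config() {
  # $1: candidate HADOOP_COMMON_HOME directory
  for dir in "$1/libexec" "$1/bin"; do
    if [ -f "$dir/hadoop-config.sh" ]; then
      # Print the first match and stop searching.
      echo "$dir/hadoop-config.sh"
      return 0
    fi
  done
  # Neither location contains the file.
  return 1
}
```

A caller could then source the result of
find_hadoop_config "$HADOOP_COMMON_HOME" and report an error when the
function returns non-zero.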
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira