[
https://issues.apache.org/jira/browse/HADOOP-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111776#comment-14111776
]
Allen Wittenauer edited comment on HADOOP-10996 at 8/27/14 3:32 AM:
--------------------------------------------------------------------
TL;DR: The best bet is to put your configs in one place and point
HADOOP_CONF_DIR at it, so that you know for certain where Hadoop is pulling
its settings from.
Longer story:
Currently, if HADOOP_CONF_DIR isn't defined, the shell code uses a bit of
interesting logic to locate it:
1. Figure out where HADOOP_PREFIX is. Is HADOOP_PREFIX defined? If not, then
let's assume it's "what's called us/..", i.e. the parent of the directory
holding the script that was run.
2. Does HADOOP_PREFIX/conf/hadoop-env.sh exist? OK, then HADOOP_PREFIX/conf
must be HADOOP_CONF_DIR.
3. No? OK, then HADOOP_CONF_DIR must be HADOOP_PREFIX/etc/hadoop.
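For illustration only, the three steps above can be sketched in shell roughly
like this. The function name locate_conf_dir and its single argument (the
directory of the script that was invoked) are made up for this example; this
is not the actual Hadoop shell code.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the lookup steps described above.
# $1: directory holding the script that was invoked (e.g. .../bin).
locate_conf_dir() {
  local caller_dir="$1"
  local prefix

  # The lookup only happens when HADOOP_CONF_DIR isn't already defined.
  if [ -n "${HADOOP_CONF_DIR:-}" ]; then
    printf '%s\n' "${HADOOP_CONF_DIR}"
    return 0
  fi

  # Step 1: use HADOOP_PREFIX if defined, else "what's called us/..".
  prefix="${HADOOP_PREFIX:-}"
  if [ -z "${prefix}" ]; then
    prefix="$(cd -P -- "${caller_dir}/.." && pwd -P)" || return 1
  fi

  # Step 2: a conf/hadoop-env.sh under the prefix means conf/ is the config dir.
  if [ -e "${prefix}/conf/hadoop-env.sh" ]; then
    printf '%s\n' "${prefix}/conf"
  else
    # Step 3: otherwise fall back to etc/hadoop.
    printf '%s\n' "${prefix}/etc/hadoop"
  fi
}
```

Calling it with two different bin directories, with neither variable set, can
return two different answers, which is exactly the trap.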
What's fun about this and what you're doing is that HADOOP_CONF_DIR will get
defined differently depending upon which bin dir you are using. :D
Fine, you say! Let's just treat all *_HOME/etc/hadoop and *_HOME/conf as
potentially valid. Now we have a very interesting problem: how do you define
HADOOP_CONF_DIR? Other stuff past Hadoop depends upon this being *one*
directory. We could pick the first one and then just shove the rest in the
classpath and no one would be the wiser!
Aha! But they would. Which one takes precedence? What happens if there are
conflicts? etc, etc. It gets messy very very fast. So... ABORT! ABORT!
(BTW, this is pretty much the same logic as in branch-2. It could be argued
that there should be a check to see if etc/hadoop is 'real' too, and an abort
if it isn't. Here's the fun part: the shell code works perfectly fine if
*-env.sh is empty now... the NN will still crash though. That said, if
HADOOP-10879 gets finished, this will almost certainly need to get revisited.
Probably better to look for core-site.xml, honestly, since all of the
sub-projects depend upon that. In other words, we could run through all of the
*_HOME, HADOOP_PREFIX, etc, and use the first core-site.xml we find as the
'real' HADOOP_CONF_DIR.)
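A rough sketch of that core-site.xml idea, purely as an illustration. The
function name, the order in which the variables are tried, and the choice to
check conf before etc/hadoop are all assumptions made for this example; nothing
like this is in the patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first directory that holds a core-site.xml.
# The search order below is an assumption, not something Hadoop defines.
find_conf_dir_by_core_site() {
  local home sub
  for home in "${HADOOP_PREFIX:-}" "${HADOOP_COMMON_HOME:-}" \
              "${HADOOP_HDFS_HOME:-}" "${HADOOP_YARN_HOME:-}" \
              "${HADOOP_MAPRED_HOME:-}"; do
    # Skip variables that are unset or empty.
    [ -n "${home}" ] || continue
    for sub in conf etc/hadoop; do
      if [ -f "${home}/${sub}/core-site.xml" ]; then
        printf '%s\n' "${home}/${sub}"
        return 0
      fi
    done
  done
  # Nothing found: let the caller decide whether to abort.
  return 1
}
```

The first match wins, so precedence is decided entirely by the search order,
and any later directories are ignored rather than merged.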
> [post-HADOOP-9902] Stop violence in the *_HOME
> ----------------------------------------------
>
> Key: HADOOP-10996
> URL: https://issues.apache.org/jira/browse/HADOOP-10996
> Project: Hadoop Common
> Issue Type: Improvement
> Components: scripts
> Affects Versions: 3.0.0
> Reporter: Allen Wittenauer
> Assignee: Allen Wittenauer
> Attachments: HADOOP-10996-01.patch, HADOOP-10996-02.patch,
> HADOOP-10996.patch
>
>
> (Updated from original description)
> There are various places where the various HOME directories are missing or
> mis-defined.