[ 
https://issues.apache.org/jira/browse/HADOOP-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111776#comment-14111776
 ] 

Allen Wittenauer edited comment on HADOOP-10996 at 8/27/14 3:32 AM:
--------------------------------------------------------------------

TL;DR: Your best bet is to put the configs in one place and point 
HADOOP_CONF_DIR at it, so that you know for certain where Hadoop is pulling 
its settings from.  
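
For example, something along these lines in the environment of whoever launches Hadoop would pin it down. The path is only an assumption for illustration; use wherever the configs actually live:

```shell
# Assumed location, not a Hadoop default: substitute your own config directory.
export HADOOP_CONF_DIR=/etc/hadoop/conf
```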

Longer story:

Currently, if HADOOP_CONF_DIR isn't defined, the shell code uses a bit of 
interesting logic to locate it:

1. Figure out where HADOOP_PREFIX is. Is HADOOP_PREFIX defined? If not, then 
let's assume it's "wherever we were called from/..".
2. Does HADOOP_PREFIX/conf/hadoop-env.sh exist? OK, then HADOOP_PREFIX/conf 
must be HADOOP_CONF_DIR.
3. No? OK, then HADOOP_CONF_DIR must be HADOOP_PREFIX/etc/hadoop.
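
The three steps above can be sketched roughly as follows. This is an illustration of the described lookup, not the actual Hadoop shell code; the function name resolve_conf_dir and its argument are made up for the example:

```shell
# Hypothetical sketch of the lookup described above; not the real Hadoop code.
# Usage: resolve_conf_dir SCRIPT_DIR
# Prints the directory that would end up as HADOOP_CONF_DIR.
resolve_conf_dir() {
  local script_dir="$1"   # directory of the script that was invoked, e.g. .../bin
  local prefix

  # An explicitly defined HADOOP_CONF_DIR short-circuits everything.
  if [ -n "${HADOOP_CONF_DIR:-}" ]; then
    printf '%s\n' "${HADOOP_CONF_DIR}"
    return 0
  fi

  # Step 1: use HADOOP_PREFIX if defined, else the parent of the caller's dir.
  prefix="${HADOOP_PREFIX:-${script_dir}/..}"

  if [ -e "${prefix}/conf/hadoop-env.sh" ]; then
    # Step 2: a conf/ directory holding hadoop-env.sh wins.
    printf '%s\n' "${prefix}/conf"
  else
    # Step 3: otherwise assume etc/hadoop, whether or not it exists.
    printf '%s\n' "${prefix}/etc/hadoop"
  fi
}
```

With neither variable set, calling this with two different bin directories can print two different answers, since the result depends entirely on the argument.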

What's fun about this and what you're doing is that HADOOP_CONF_DIR will get 
defined differently depending upon which bin dir you are using. :D

Fine, you say!  Let's just treat all *_HOME/etc/hadoop and *_HOME/conf as 
potentially valid.  Now we have a very interesting problem:  how do you define 
HADOOP_CONF_DIR?  Other stuff past Hadoop depends upon this being *one* 
directory.  We could pick the first one and then just shove the rest in the 
classpath and none would be the wiser!

Aha! But they would.  Which one takes precedence? What happens if there are 
conflicts? etc, etc. It gets messy very very fast. So... ABORT! ABORT!

(BTW, this is pretty much the same logic as in branch-2. It could be argued 
that there should be a check to see if etc/hadoop is 'real' too and abort if 
it isn't.  Here's the fun part: the shell code works perfectly fine if 
*-env.sh is empty now... the NN will still crash though.  That said, if 
HADOOP-10879 gets finished, this will almost certainly need to get revisited.  
Probably better to look for core-site.xml, honestly, since all of the 
sub-projects depend upon that.  In other words, we could run through all of 
the *_HOME, HADOOP_PREFIX, etc, and use the first core-site.xml we find as the 
'real' HADOOP_CONF_DIR.)
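
A rough sketch of that core-site.xml idea, purely as an illustration. The function name find_conf_dir, the order of the candidates, and checking conf before etc/hadoop are all assumptions here, not anything from a patch:

```shell
# Hypothetical sketch: pick the first candidate home that has a core-site.xml.
# Usage: find_conf_dir HOME_DIR [HOME_DIR...]
# Prints the matching config directory, or returns 1 if none is found.
find_conf_dir() {
  local home sub
  for home in "$@"; do
    # Skip candidates whose variable was unset or empty.
    [ -n "${home}" ] || continue
    for sub in conf etc/hadoop; do
      if [ -f "${home}/${sub}/core-site.xml" ]; then
        printf '%s\n' "${home}/${sub}"
        return 0
      fi
    done
  done
  return 1
}

# Possible call site (the variable list is an assumption):
#   HADOOP_CONF_DIR="$(find_conf_dir "${HADOOP_PREFIX}" "${HADOOP_HDFS_HOME}")"
```

Whichever home is listed first wins, so the precedence question does not go away; it just becomes explicit in the argument order.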


was (Author: aw):
TL;DR: Absolute best bet is to put configs some place and assign 
HADOOP_CONF_DIR to it so that you have absolute certainty on where Hadoop is 
pulling settings.  

Longer story:

Currently, if HADOOP_CONF_DIR isn't defined, it uses a bit of twisted logic to 
locate it:

1. Figure out where HADOOP_PREFIX is at. Is HADOOP_PREFIX defined? If not, then 
let's assume it's "what's called us/..".
2. Does HADOOP_PREFIX/conf/hadoop-env.sh exist? OK, then that must be 
HADOOP_CONF_DIR
3. No? OK, then HADOOP_CONF_DIR must be HADOOP_PREFIX/etc/hadoop.

What's fun about this and what you're doing is that HADOOP_CONF_DIR will get 
defined differently depending upon which bin dir you are using. :D

Fine, you say!  Let's just treat all *_HOME/etc/hadoop and *_HOME/conf as 
potentially valid.  Now we have a very interesting problem:  how do you define 
HADOOP_CONF_DIR?  Other stuff past Hadoop depends upon this being *one* 
directory.  We could pick the first one and then just shove the rest in the 
classpath and none would be the wiser!

Aha! But they would.  Which one takes precedence? What happens if there are 
conflicts? etc, etc. It gets messy very very fast. So... ABORT! ABORT!

(BTW, this is pretty much the same logic from branch-2. It could be argued that 
there should be a check to see if etc/hadoop is 'real' too and abort on it.  
Here's the fun part: the shell code works perfectly fine if *-env.sh is empty 
now... the NN will still crash though.  That said, if HADOOP-10879 gets 
finished, this will almost certainly need to get revisited.  Probably better to 
look for core-site.xml, honestly, since all of the sub-projects all depend upon 
that.  In other words, we could run through all of the *_HOME, HADOOP_PREFIX, 
etc, and use the first core-site.xml we find as the 'real' HADOOP_CONF_DIR.)

> [post-HADOOP-9902] Stop violence in the *_HOME
> ----------------------------------------------
>
>                 Key: HADOOP-10996
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10996
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-10996-01.patch, HADOOP-10996-02.patch, 
> HADOOP-10996.patch
>
>
> (Updated from original description)
> There are various places where the various HOME directories are missing or 
> mis-defined. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
