shanyu zhao created HADOOP-10245:
------------------------------------

             Summary: Hadoop command line always appends "-Xmx" option twice
                 Key: HADOOP-10245
                 URL: https://issues.apache.org/jira/browse/HADOOP-10245
             Project: Hadoop Common
          Issue Type: Bug
          Components: bin
    Affects Versions: 2.2.0
            Reporter: shanyu zhao
            Assignee: shanyu zhao


The Hadoop command line scripts (hadoop.sh or hadoop.cmd) call java with the 
"-Xmx" option twice. As a result, any user-defined HADOOP_HEAP_SIZE env 
variable has no effect, because it is overridden by the second "-Xmx" 
option.

For example, here is the java command generated for "hadoop fs -ls /". 
Notice that there are two "-Xmx" options, "-Xmx1000m" and "-Xmx512m", in the 
command line:

java -Xmx1000m  -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log 
-Dhadoop.root.logger=INFO,console,DRFA -Xmx512m  -Dhadoop.security.logger=INFO,RFAS 
-classpath XXX org.apache.hadoop.fs.FsShell -ls /
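
To see which heap setting takes effect in a command line like the one above, a small check can help. This is only a sketch: effective_xmx is a hypothetical helper, not part of Hadoop, and it assumes (as this report describes) that the JVM honors the last "-Xmx" option it is given.

```shell
# Hypothetical helper (not part of Hadoop): print the last -Xmx option
# found among the arguments, assuming the last one is the one that wins.
effective_xmx() {
  last=""
  for arg in "$@"; do
    case "$arg" in
      -Xmx*) last="$arg" ;;   # remember the most recent -Xmx seen
    esac
  done
  printf '%s\n' "$last"
}

# Shortened form of the command line from the example above.
effective_xmx -Xmx1000m -Dhadoop.log.file=hadoop.log -Xmx512m -classpath XXX
```

For the shortened example arguments the helper reports -Xmx512m, the value coming from HADOOP_CLIENT_OPTS.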

Here is the root cause:
The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls 
hadoop-env.sh. 
In hadoop.sh, the command line is generated by the following pseudocode:
java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ...

In hadoop-config.sh, $JAVA_HEAP_MAX is initialized to "-Xmx1000m" if the user 
has not set the $HADOOP_HEAP_SIZE env variable.

In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set as follows:
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
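
The interaction of the three scripts can be reproduced with a simplified stand-in. This is a sketch, not the real Hadoop code: build_java_cmd is a hypothetical function, the variable name HADOOP_HEAP_SIZE follows this report, and the heap size is assumed to be given in megabytes.

```shell
# Simplified stand-in for the call flow described above (not the real scripts).
# The body runs in a subshell so the variables do not leak into the caller.
build_java_cmd() (
  HADOOP_HEAP_SIZE="$1"      # user-supplied heap size in MB; may be empty
  HADOOP_CLIENT_OPTS=""

  # hadoop-config.sh: use the default unless the user supplied a heap size.
  JAVA_HEAP_MAX="-Xmx1000m"
  if [ -n "$HADOOP_HEAP_SIZE" ]; then
    JAVA_HEAP_MAX="-Xmx${HADOOP_HEAP_SIZE}m"
  fi

  # hadoop-env.sh: prepends a hard-coded client heap size.
  HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"

  # hadoop.sh: assembles the final command line.
  echo "java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath XXX org.apache.hadoop.fs.FsShell -ls /"
)

# Even with an explicit 2048 MB request, -Xmx512m still follows it.
build_java_cmd 2048
```

Whatever value is passed in, the generated line contains the user's "-Xmx" followed by "-Xmx512m", which matches the behavior reported here.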

To fix this problem, we should remove the "-Xmx512m" from HADOOP_CLIENT_OPTS, 
so that only $JAVA_HEAP_MAX contributes an "-Xmx" option. If we really want to 
change the memory settings, we need to use the $HADOOP_HEAP_SIZE env variable.
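
A sketch of what the proposed change could look like in hadoop-env.sh, together with a quick way to confirm that only one "-Xmx" remains. The count_xmx function is a hypothetical helper, and this is an illustration of the proposal, not the committed patch.

```shell
# hadoop-env.sh after the proposed change (sketch): no hard-coded -Xmx512m.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-}"

# Hypothetical check (not part of Hadoop): count the -Xmx options in a command line.
count_xmx() {
  n=0
  for arg in "$@"; do
    case "$arg" in
      -Xmx*) n=$((n + 1)) ;;
    esac
  done
  echo "$n"
}

# $HADOOP_CLIENT_OPTS is left unquoted on purpose so it splits into separate options.
count_xmx java -Xmx2048m $HADOOP_CLIENT_OPTS -classpath XXX org.apache.hadoop.fs.FsShell -ls /
```

If the count is greater than one, something other than $JAVA_HEAP_MAX is still injecting a heap option, for example a user-set HADOOP_CLIENT_OPTS.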



