Ah, yes, I have seen this issue. Typically it is because you have JAVA_HOME set on your host, but not on your Mesos agent. If you run a Marathon job and output "env", you will see that the JAVA_HOME environment variable is missing. You would need to set it in your agent init configuration, e.g. export JAVA_HOME=<pathtojava>
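[Editor's note: a quick way to see why exports in ~/.bashrc never reach the fetcher is to simulate its environment. mesos-slave and the `sh -c` subprocesses it spawns do not source ~/.bashrc, so variables must come from the agent's own init environment. A minimal sketch; the /etc/default path is illustrative and depends on your init system:]

```shell
# Simulate the environment the fetcher's `sh -c` runs in: ~/.bashrc is
# never sourced, so JAVA_HOME set there is simply absent.
env -i /bin/sh -c 'echo "JAVA_HOME=[$JAVA_HOME]"'
# prints: JAVA_HOME=[]

# One common fix (path illustrative): export the variable in the file your
# init script sources before starting the agent, e.g. /etc/default/mesos-slave:
#   export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk/jre
```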
Thanks,
Elizabeth

On Wed, Nov 4, 2015 at 1:20 AM, haosdent <[email protected]> wrote:

> How about adding this flag when launching the slave:
> --executor_environment_variables='{"HADOOP_HOME": "/opt/hadoop-2.6.0"}' ?
>
> On Wed, Nov 4, 2015 at 5:13 PM, Du, Fan <[email protected]> wrote:
>
>> On 2015/11/4 17:09, haosdent wrote:
>>
>>> I notice
>>> ```
>>> "user":"root"
>>> ```
>>> Are you sure you can execute `hadoop version` as root?
>>
>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# whoami
>> root
>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# hadoop version
>> Hadoop 2.6.0
>> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
>> Compiled by jenkins on 2014-11-13T21:10Z
>> Compiled with protoc 2.5.0
>> From source with checksum 18e43357c8f927c0695f1e9522859d6a
>> This command was run using /opt/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
>>
>> [root@tylersburg spark-1.5.1-bin-hadoop2.6]# ls -hl /opt/hadoop-2.6.0/bin/hadoop
>> -rwxr-xr-x. 1 root root 5.4K Nov 3 08:36 /opt/hadoop-2.6.0/bin/hadoop
>>
>>> On Wed, Nov 4, 2015 at 4:56 PM, Du, Fan <[email protected]> wrote:
>>>
>>> On 2015/11/4 16:40, Tim Chen wrote:
>>>
>>> What OS are you running this with?
>>>
>>> And I assume that if you run /bin/sh and try to run hadoop, it can be found in
>>> your PATH as well?
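[Editor's note: the agent flag haosdent mentions takes a JSON object, so several variables can be passed at once; supplying JAVA_HOME alongside HADOOP_HOME should also avoid the follow-up failure Fan reports later in the thread, where the fetcher finds hadoop but hadoop cannot find Java. An illustrative invocation, not a verified configuration; master address and paths are placeholders:]

```shell
# Illustrative mesos-slave invocation; adjust master address and paths.
mesos-slave --master=mesos-master:5050 \
  --executor_environment_variables='{
    "JAVA_HOME": "/usr/lib/jvm/java-1.7.0-openjdk/jre",
    "HADOOP_HOME": "/opt/hadoop-2.6.0"
  }'
```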
>>> I'm using CentOS-7.2
>>>
>>> # /bin/sh hadoop version
>>> Hadoop 2.6.0
>>> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
>>> Compiled by jenkins on 2014-11-13T21:10Z
>>> Compiled with protoc 2.5.0
>>> From source with checksum 18e43357c8f927c0695f1e9522859d6a
>>> This command was run using /opt/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
>>>
>>> Tim
>>>
>>> On Wed, Nov 4, 2015 at 12:34 AM, Du, Fan <[email protected]> wrote:
>>>
>>> Hi Mesos experts,
>>>
>>> I set up a small Mesos cluster with 1 master and 6 slaves, and deployed
>>> HDFS on the same cluster topology, both under the root user.
>>>
>>> # cat spark-1.5.1-bin-hadoop2.6/conf/spark-env.sh
>>> export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
>>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.el7_1.x86_64/jre/
>>> export SPARK_EXECUTOR_URI=hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz
>>>
>>> When I run a simple SparkPi test:
>>> # export MASTER=mesos://Mesos_Master_IP:5050
>>> # spark-1.5.1-bin-hadoop2.6/bin/run-example SparkPi 10000
>>>
>>> I get this on the slaves:
>>>
>>> I1104 22:24:02.238471 14518 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/root","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"hdfs:\/\/test\/spark-1.5.1-bin-hadoop2.6.tgz"}}],"sandbox_directory":"\/ws\/mesos\/slaves\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/frameworks\/556b49c1-7e6a-4f99-b320-c3f0c849e836-0003\/executors\/556b49c1-7e6a-4f99-b320-c3f0c849e836-S6\/runs\/9ec70f41-67d5-4a95-999f-933f3aa9e261","user":"root"}
>>> I1104 22:24:02.240910 14518 fetcher.cpp:369] Fetching URI 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz'
>>> I1104 22:24:02.240931 14518 fetcher.cpp:243] Fetching directly into the
>>> sandbox directory
>>> I1104 22:24:02.240952 14518 fetcher.cpp:180] Fetching URI 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz'
>>> E1104 22:24:02.245264 14518 shell.hpp:90] Command 'hadoop version 2>&1' failed; this is the output:
>>> sh: hadoop: command not found
>>> Failed to fetch 'hdfs://test/spark-1.5.1-bin-hadoop2.6.tgz':
>>> Skipping fetch with Hadoop client: Failed to execute 'hadoop version 2>&1';
>>> the command was either not found or exited with a non-zero exit status: 127
>>> Failed to synchronize with slave (it's probably exited)
>>>
>>> As "sh: hadoop: command not found" indicates, when Mesos executes the
>>> "hadoop version" command it cannot find a valid hadoop binary. But when I
>>> actually log into the slave, "hadoop version" runs fine, because I added
>>> the hadoop path to the PATH env in ~/.bashrc:
>>>
>>> # cat ~/.bashrc
>>> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91-2.6.2.1.el7_1.x86_64/jre/
>>> export HADOOP_PREFIX=/opt/hadoop-2.6.0
>>> export HADOOP_HOME=$HADOOP_PREFIX
>>> export HADOOP_COMMON_HOME=$HADOOP_PREFIX
>>> export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
>>> export HADOOP_HDFS_HOME=$HADOOP_PREFIX
>>> export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
>>> export HADOOP_YARN_HOME=$HADOOP_PREFIX
>>> export PATH=$PATH:$HADOOP_PREFIX/sbin:$HADOOP_PREFIX/bin
>>>
>>> I also tried setting HADOOP_HOME when launching mesos-slave; no luck, the
>>> slave then complains it cannot find the JAVA_HOME env when executing
>>> "hadoop version".
>>>
>>> Finally I checked the Mesos code where this error happens; it looks
>>> quite straightforward.
>>>
>>> ./src/hdfs/hdfs.hpp
>>> 44 // HTTP GET on hostname:port and grab the information in the
>>> 45 // <title>...</title> (this is the best hack I can think of to get
>>> 46 // 'fs.default.name' given the tools available).
>>> 47 struct HDFS
>>> 48 {
>>> 49   // Look for `hadoop' first where proposed, otherwise, look for
>>> 50   // HADOOP_HOME, otherwise, assume it's on the PATH.
>>> 51   explicit HDFS(const std::string& _hadoop)
>>> 52     : hadoop(os::exists(_hadoop)
>>> 53              ? _hadoop
>>> 54              : (os::getenv("HADOOP_HOME").isSome()
>>> 55                 ? path::join(os::getenv("HADOOP_HOME").get(), "bin/hadoop")
>>> 56                 : "hadoop")) {}
>>> 57
>>> 58   // Look for `hadoop' in HADOOP_HOME or assume it's on the PATH.
>>> 59   HDFS()
>>> 60     : hadoop(os::getenv("HADOOP_HOME").isSome()
>>> 61              ? path::join(os::getenv("HADOOP_HOME").get(), "bin/hadoop")
>>> 62              : "hadoop") {}
>>> 63
>>> 64   // Check if hadoop client is available at the path that was set.
>>> 65   // This can be done by executing `hadoop version` command and
>>> 66   // checking for status code == 0.
>>> 67   Try<bool> available()
>>> 68   {
>>> 69     Try<std::string> command = strings::format("%s version", hadoop);
>>> 70
>>> 71     CHECK_SOME(command);
>>> 72
>>> 73     // We are piping stderr to stdout so that we can see the error (if
>>> 74     // any) in the logs emitted by `os::shell()` in case of failure.
>>> 75     Try<std::string> out = os::shell(command.get() + " 2>&1");
>>> 76
>>> 77     if (out.isError()) {
>>> 78       return Error(out.error());
>>> 79     }
>>> 80
>>> 81     return true;
>>> 82   }
>>>
>>> It puzzled me for a while; am I missing something obvious?
>>> Thanks in advance.

--
Best Regards,
Haosdent Huang
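[Editor's note: the constructor logic quoted above can be mirrored in shell to make both failure modes in this thread visible. With HADOOP_HOME unset, the fetcher depends entirely on the PATH of a non-login shell, and a missing command makes `sh` exit with status 127, exactly the code in the error log. A sketch; `resolve_hadoop` is a hypothetical helper name, not Mesos code:]

```shell
# Mirror of the HDFS() fallback in hdfs.hpp: prefer $HADOOP_HOME/bin/hadoop,
# otherwise rely on whatever `hadoop` the shell's PATH can find.
resolve_hadoop() {
  if [ -n "${HADOOP_HOME:-}" ]; then
    echo "$HADOOP_HOME/bin/hadoop"
  else
    echo "hadoop"
  fi
}

HADOOP_HOME=/opt/hadoop-2.6.0
resolve_hadoop                  # prints /opt/hadoop-2.6.0/bin/hadoop
unset HADOOP_HOME
resolve_hadoop                  # prints hadoop

# The exit status the fetcher reports when the command is missing:
sh -c 'no-such-hadoop version' 2>/dev/null; echo $?   # prints 127
```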

