[
https://issues.apache.org/jira/browse/HADOOP-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236645#comment-13236645
]
FKorning commented on HADOOP-7682:
----------------------------------
There are a bunch of issues at work here. I've patched this up locally
in my own 1.0.2-SNAPSHOT, but it takes a lot of yak-shaving to fix.
---
First you need to set up hadoop-1.0.1 including source, ant, ivy,
and Cygwin with ssh/ssl and tcp_wrappers.
Then configure sshd (ssh-host-config) to create a privileged cyg_server user.
From an admin Cygwin shell, you then have to edit the /etc/passwd
file and give that user a valid shell and user home, change the
password for the user, and finally generate ssh keys for the user
and copy the user's id_rsa.pub public key into ~/.ssh/authorized_keys.
If done right, you should be able to ssh cyg_server@localhost.
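For reference, here is a rough sketch of that setup from an elevated Cygwin
shell. The exact commands and prompts vary by Cygwin version, so treat this
as a checklist rather than a script:
# sketch only -- run from an admin (elevated) Cygwin shell; adjust to your Cygwin version
ssh-host-config -y                        # set up the sshd service and the privileged cyg_server account
mkpasswd -l > /etc/passwd                 # regenerate /etc/passwd, then edit cyg_server's shell and home
passwd cyg_server                         # give the service account a real password
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa  # generate a key pair for the login user
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys   # authorized_keys belongs to the account you ssh into
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
ssh cyg_server@localhost                  # should succeed if everything is wired up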
---
Now the main problem is a confusion between the Hadoop shell scripts,
which expect Unix paths like /tmp, and the Hadoop Java binaries, which
interpret that path as C:\tmp.
Unfortunately, neither Cygwin symlinks nor even Windows NT junctions
are supported by Java's I/O filesystem. Thus the only way to get
around this is to force the Cygwin paths to be identical to the Windows
paths.
I get around this by creating a circular symlink "/cygwin" -> "/".
To avoid confusion with "C:" drive mappings, all my paths are relative.
This means that Windows "\cygwin\tmp" equals Cygwin's "/cygwin/tmp"
(see the sketch below).
For pid files use /cygwin/tmp/
For tmp files use /cygwin/tmp/hadoop-${USER}/
For log files use /cygwin/tmp/hadoop-${USER}/logs/
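A minimal sketch of that layout, assuming Cygwin is installed under C:\cygwin
(your drive letter and install path may differ):
# sketch only -- assumes the Cygwin root is C:\cygwin, so Windows "\cygwin\..."
# and Cygwin "/cygwin/..." resolve to the same directory tree
ln -s / /cygwin                            # circular symlink: /cygwin -> /
mkdir -p /cygwin/tmp/hadoop-${USER}/logs   # pid, tmp and log dirs used above
ls -ld /cygwin/tmp                         # same directory that Windows sees as \cygwin\tmp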
---
First, the ssh slaves invocation wrapper is broken because it fails to
provide the user's ssh login, which Cygwin's OpenSSH does not default to.
slaves.sh:
for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
  ssh -l $USER $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \
    2>&1 | sed "s/^/$slave: /" &
  if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then
    sleep $HADOOP_SLAVE_SLEEP
  fi
done
Next, the Hadoop shell scripts are broken. You need to fix the environment
for Cygwin paths in hadoop-env.sh, and then make sure this file is sourced
by hadoop-config.sh and by the hadoop wrapper script itself (a sketch of the
sourcing guard follows the env snippet below). For me the JRE java invocation
was also broken, so I provide the whole script below.
hadoop-env.sh:
HADOOP_PID_DIR=/cygwin/tmp/
HADOOP_TMP_DIR=/cygwin/tmp/hadoop-${USER}
HADOOP_LOG_DIR=/cygwin/tmp/hadoop-${USER}/logs
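hadoop-config.sh normally sources hadoop-env.sh from HADOOP_CONF_DIR; a guard
like the following sketch (location and exact wording may differ in your tree)
can also be added to the hadoop wrapper if your copy doesn't pick it up:
# sketch: make sure hadoop-env.sh is sourced once HADOOP_CONF_DIR is resolved
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
. "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi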
hadoop (sh):
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The Hadoop command script
#
# Environment Variables
#
# JAVA_HOME The java implementation to use. Overrides JAVA_HOME.
#
# HADOOP_CLASSPATH Extra Java CLASSPATH entries.
#
# HADOOP_USER_CLASSPATH_FIRST When defined, the HADOOP_CLASSPATH is
# added in the beginning of the global
# classpath. Can be defined, for example,
# by doing
# export HADOOP_USER_CLASSPATH_FIRST=true
#
# HADOOP_HEAPSIZE The maximum amount of heap to use, in MB.
# Default is 1000.
#
# HADOOP_OPTS Extra Java runtime options.
#
# HADOOP_NAMENODE_OPTS These options are added to HADOOP_OPTS
# HADOOP_CLIENT_OPTS when the respective command is run.
# HADOOP_{COMMAND}_OPTS etc HADOOP_JT_OPTS applies to JobTracker
# for e.g. HADOOP_CLIENT_OPTS applies to
# more than one command (fs, dfs, fsck,
# dfsadmin etc)
#
# HADOOP_CONF_DIR Alternate conf dir. Default is ${HADOOP_HOME}/conf.
#
# HADOOP_ROOT_LOGGER The root appender. Default is INFO,console
#
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
cygwin=false
case "`uname`" in
CYGWIN*) cygwin=true;;
esac
if [ -e "$bin"/../libexec/hadoop-config.sh ]; then
. "$bin"/../libexec/hadoop-config.sh
else
. "$bin"/hadoop-config.sh
fi
# if no args specified, show usage
if [ $# = 0 ]; then
echo "Usage: hadoop [--config confdir] COMMAND"
echo "where COMMAND is one of:"
echo " namenode -format format the DFS filesystem"
echo " secondarynamenode run the DFS secondary namenode"
echo " namenode run the DFS namenode"
echo " datanode run a DFS datanode"
echo " dfsadmin run a DFS admin client"
echo " mradmin run a Map-Reduce admin client"
echo " fsck run a DFS filesystem checking utility"
echo " fs run a generic filesystem user client"
echo " balancer run a cluster balancing utility"
echo " fetchdt fetch a delegation token from the NameNode"
echo " jobtracker run the MapReduce job Tracker node"
echo " pipes run a Pipes job"
echo " tasktracker run a MapReduce task Tracker node"
echo " historyserver run job history servers as a standalone daemon"
echo " job manipulate MapReduce jobs"
echo " queue get information regarding JobQueues"
echo " version print the version"
echo " jar <jar> run a jar file"
echo " distcp <srcurl> <desturl> copy file or directories recursively"
echo " archive -archiveName NAME -p <parent path> <src>* <dest> create a
hadoop archive"
echo " classpath prints the class path needed to get the"
echo " Hadoop jar and the required libraries"
echo " daemonlog get/set the log level for each daemon"
echo " or"
echo " CLASSNAME run the class named CLASSNAME"
echo "Most commands print help when invoked w/o parameters."
exit 1
fi
# get arguments
COMMAND=$1
shift
# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$COMMAND" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
starting_secure_dn="true"
fi
if [ "$JAVA_HOME" != "" ]; then
#echo "JAVA_HOME: $JAVA_HOME"
JAVA_HOME="$JAVA_HOME"
fi
# some Java parameters
if $cygwin; then
JAVA_HOME=`cygpath -w "$JAVA_HOME"`
#echo "cygwin JAVA_HOME: $JAVA_HOME"
fi
if [ "$JAVA_HOME" == "" ]; then
echo "Error: JAVA_HOME is not set: $JAVA_HOME"
exit 1
fi
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m
# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
#echo "run with heapsize $HADOOP_HEAPSIZE"
JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
#echo $JAVA_HEAP_MAX
fi
# CLASSPATH initially contains $HADOOP_CONF_DIR
CLASSPATH="${HADOOP_CONF_DIR}"
if [ "$HADOOP_USER_CLASSPATH_FIRST" != "" ] && [ "$HADOOP_CLASSPATH" != "" ] ;
then
CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi
CLASSPATH=${CLASSPATH}:$JAVA_HOME/lib/tools.jar
# for developers, add Hadoop classes to CLASSPATH
if [ -d "$HADOOP_HOME/build/classes" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/classes
fi
if [ -d "$HADOOP_HOME/build/webapps" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build
fi
if [ -d "$HADOOP_HOME/build/test/classes" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/test/classes
fi
if [ -d "$HADOOP_HOME/build/tools" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_HOME/build/tools
fi
# so that filenames w/ spaces are handled correctly in loops below
IFS=
# for releases, add core hadoop jar & webapps to CLASSPATH
if [ -e $HADOOP_PREFIX/share/hadoop/hadoop-core-* ]; then
# binary layout
if [ -d "$HADOOP_PREFIX/share/hadoop/webapps" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_PREFIX/share/hadoop
fi
for f in $HADOOP_PREFIX/share/hadoop/hadoop-core-*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
# add libs to CLASSPATH
for f in $HADOOP_PREFIX/share/hadoop/lib/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
for f in $HADOOP_PREFIX/share/hadoop/lib/jsp-2.1/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
for f in $HADOOP_PREFIX/share/hadoop/hadoop-tools-*.jar; do
TOOL_PATH=${TOOL_PATH}:$f;
done
else
# tarball layout
if [ -d "$HADOOP_HOME/webapps" ]; then
CLASSPATH=${CLASSPATH}:$HADOOP_HOME
fi
for f in $HADOOP_HOME/hadoop-core-*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
# add libs to CLASSPATH
for f in $HADOOP_HOME/lib/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
if [ -d "$HADOOP_HOME/build/ivy/lib/Hadoop/common" ]; then
for f in $HADOOP_HOME/build/ivy/lib/Hadoop/common/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
fi
for f in $HADOOP_HOME/lib/jsp-2.1/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
for f in $HADOOP_HOME/hadoop-tools-*.jar; do
TOOL_PATH=${TOOL_PATH}:$f;
done
for f in $HADOOP_HOME/build/hadoop-tools-*.jar; do
TOOL_PATH=${TOOL_PATH}:$f;
done
fi
# add user-specified CLASSPATH last
if [ "$HADOOP_USER_CLASSPATH_FIRST" = "" ] && [ "$HADOOP_CLASSPATH" != "" ];
then
CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi
# default log directory & file
if [ "$HADOOP_LOG_DIR" = "" ]; then
HADOOP_LOG_DIR="$HADOOP_HOME/logs"
fi
if [ "$HADOOP_LOGFILE" = "" ]; then
HADOOP_LOGFILE='hadoop.log'
fi
# default policy file for service-level authorization
if [ "$HADOOP_POLICYFILE" = "" ]; then
HADOOP_POLICYFILE="hadoop-policy.xml"
fi
# restore ordinary behaviour
unset IFS
# figure out which class to run
if [ "$COMMAND" = "classpath" ] ; then
if $cygwin; then
CLASSPATH=`cygpath -wp "$CLASSPATH"`
fi
echo $CLASSPATH
exit
elif [ "$COMMAND" = "namenode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
elif [ "$COMMAND" = "secondarynamenode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
elif [ "$COMMAND" = "datanode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
if [ "$starting_secure_dn" = "true" ]; then
HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi
elif [ "$COMMAND" = "fs" ] ; then
CLASS=org.apache.hadoop.fs.FsShell
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfs" ] ; then
CLASS=org.apache.hadoop.fs.FsShell
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "dfsadmin" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "mradmin" ] ; then
CLASS=org.apache.hadoop.mapred.tools.MRAdmin
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "fsck" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DFSck
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "balancer" ] ; then
CLASS=org.apache.hadoop.hdfs.server.balancer.Balancer
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_BALANCER_OPTS"
elif [ "$COMMAND" = "fetchdt" ] ; then
CLASS=org.apache.hadoop.hdfs.tools.DelegationTokenFetcher
elif [ "$COMMAND" = "jobtracker" ] ; then
CLASS=org.apache.hadoop.mapred.JobTracker
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOBTRACKER_OPTS"
elif [ "$COMMAND" = "historyserver" ] ; then
CLASS=org.apache.hadoop.mapred.JobHistoryServer
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOB_HISTORYSERVER_OPTS"
elif [ "$COMMAND" = "tasktracker" ] ; then
CLASS=org.apache.hadoop.mapred.TaskTracker
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_TASKTRACKER_OPTS"
elif [ "$COMMAND" = "job" ] ; then
CLASS=org.apache.hadoop.mapred.JobClient
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "queue" ] ; then
CLASS=org.apache.hadoop.mapred.JobQueueClient
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "pipes" ] ; then
CLASS=org.apache.hadoop.mapred.pipes.Submitter
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "version" ] ; then
CLASS=org.apache.hadoop.util.VersionInfo
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "jar" ] ; then
CLASS=org.apache.hadoop.util.RunJar
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "distcp" ] ; then
CLASS=org.apache.hadoop.tools.DistCp
CLASSPATH=${CLASSPATH}:${TOOL_PATH}
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "daemonlog" ] ; then
CLASS=org.apache.hadoop.log.LogLevel
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "archive" ] ; then
CLASS=org.apache.hadoop.tools.HadoopArchives
CLASSPATH=${CLASSPATH}:${TOOL_PATH}
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
elif [ "$COMMAND" = "sampler" ] ; then
CLASS=org.apache.hadoop.mapred.lib.InputSampler
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
else
CLASS=$COMMAND
fi
# cygwin path translation
if $cygwin; then
JAVA_HOME=`cygpath -w "$JAVA_HOME"`
CLASSPATH=`cygpath -wp "$CLASSPATH"`
HADOOP_HOME=`cygpath -w "$HADOOP_HOME"`
HADOOP_LOG_DIR=`cygpath -w "$HADOOP_LOG_DIR"`
TOOL_PATH=`cygpath -wp "$TOOL_PATH"`
fi
# setup 'java.library.path' for native-hadoop code if necessary
JAVA_LIBRARY_PATH=''
if [ -d "${HADOOP_HOME}/build/native" -o -d "${HADOOP_HOME}/lib/native" -o -e
"${HADOOP_PREFIX}/lib/libhadoop.a" ]; then
JAVA_PLATFORM=`${JAVA} -classpath ${CLASSPATH} -Xmx32m
${HADOOP_JAVA_PLATFORM_OPTS} org.apache.hadoop.util.PlatformName | sed -e "s/
/_/g"`
#echo "JAVA_PLATFORM: $JAVA_PLATFORM"
if [ "$JAVA_PLATFORM" = "Windows_7-amd64-64" ]; then
JSVC_ARCH="amd64"
elif [ "$JAVA_PLATFORM" = "Linux-amd64-64" ]; then
JSVC_ARCH="amd64"
else
JSVC_ARCH="i386"
fi
if [ -d "$HADOOP_HOME/build/native" ]; then
JAVA_LIBRARY_PATH=${HADOOP_HOME}/build/native/${JAVA_PLATFORM}/lib
fi
if [ -d "${HADOOP_HOME}/lib/native" ]; then
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_HOME}/lib/native/${JAVA_PLATFORM}
else
JAVA_LIBRARY_PATH=${HADOOP_HOME}/lib/native/${JAVA_PLATFORM}
fi
fi
if [ -e "${HADOOP_PREFIX}/lib/libhadoop.a" ]; then
JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/lib
fi
fi
# cygwin path translation
if $cygwin; then
JAVA_LIBRARY_PATH=`cygpath -wp "$JAVA_LIBRARY_PATH"`
PATH="/cygwin/bin:/cygwin/usr/bin:`cygpath -p ${PATH}`"
fi
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.tmp.dir=$HADOOP_TMP_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.file=$HADOOP_LOGFILE"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_HOME"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
HADOOP_OPTS="$HADOOP_OPTS
-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
#turn security logger on the namenode and jobtracker only
if [ $COMMAND = "namenode" ] || [ $COMMAND = "jobtracker" ]; then
HADOOP_OPTS="$HADOOP_OPTS
-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,DRFAS}"
else
HADOOP_OPTS="$HADOOP_OPTS
-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"
fi
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
fi
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.policy.file=$HADOOP_POLICYFILE"
# Check to see if we should start a secure datanode
if [ "$starting_secure_dn" = "true" ]; then
if [ "$HADOOP_PID_DIR" = "" ]; then
HADOOP_SECURE_DN_PID="/tmp/hadoop_secure_dn.pid"
else
HADOOP_SECURE_DN_PID="$HADOOP_PID_DIR/hadoop_secure_dn.pid"
fi
exec "$HADOOP_HOME/libexec/jsvc.${JSVC_ARCH}" -Dproc_$COMMAND -outfile
"$HADOOP_LOG_DIR/jsvc.out" \
-errfile
"$HADOOP_LOG_DIR/jsvc.err" \
-pidfile
"$HADOOP_SECURE_DN_PID" \
-nodetach \
-user "$HADOOP_SECURE_DN_USER" \
-cp "$CLASSPATH" \
$JAVA_HEAP_MAX $HADOOP_OPTS \
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
else
# run it
exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS -classpath
"$CLASSPATH" $CLASS "$@"
fi
----
Next, the Hadoop fs commands and utilities are broken, as they expect shells
with POSIX /bin executables on their path (bash, chmod, chown, chgrp).
For various reasons it's a really bad idea to add "/cygwin/bin" to your
Windows path, so we're going to have to fix the utility classes to
be Cygwin-aware and use the "/cygwin/bin" binaries instead.
This is why you need the source: we're going to have to fix
the Java source and recompile the Hadoop core libraries (and why you
need ant and ivy).
----
Before we do this, note that the contrib Gridmix is broken: it uses generic
Enum code that breaks under JDK/JRE 1.7 and above.
The fix is to dumb it down and use raw (untyped) Enums.
Gridmix.java:
/*
private <T> String getEnumValues(Enum<? extends T>[] e) {
  StringBuilder sb = new StringBuilder();
  String sep = "";
  for (Enum<? extends T> v : e) {
    sb.append(sep);
    sb.append(v.name());
    sep = "|";
  }
  return sb.toString();
}
*/
private String getEnumValues(Enum[] e) {
  StringBuilder sb = new StringBuilder();
  String sep = "";
  for (Enum v : e) {
    sb.append(sep);
    sb.append(v.name());
    sep = "|";
  }
  return sb.toString();
}
---
Next, the Ivy build.xml and build-contrib.xml scripts are broken,
as they fail to set the correct compiler target (javac 1.7) everywhere.
Modify all of these to include the following in all javac targets:
build-contrib.xml:
<property name="javac.debug" value="on"/>
<property name="javac.version" value="1.7"/>
...
<!-- ====================================================== -->
<!-- Compile a Hadoop contrib's files                        -->
<!-- ====================================================== -->
<target name="compile" depends="init, ivy-retrieve-common" unless="skip.contrib">
  <echo message="contrib: ${name}"/>
  <javac
    encoding="${build.encoding}"
    srcdir="${src.dir}"
    includes="**/*.java"
    destdir="${build.classes}"
    target="${javac.version}"
    source="${javac.version}"
    optimize="${javac.optimize}"
    debug="${javac.debug}"
    deprecation="${javac.deprecation}">
    <classpath refid="contrib-classpath"/>
  </javac>
</target>
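If you'd rather not edit every contrib build file, the same properties can
usually be overridden on the ant command line instead (a sketch, assuming the
stock property names shown above):
# sketch: override the compiler level at invocation time instead of editing each file
ant -f build.xml -Djavac.version=1.7 -Djavac.debug=on compile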
---
Next we fix the Hadoop utility class Shell.java to use Cygwin paths:
Shell.java:
/** Set to true on Windows platforms */
public static final boolean WINDOWS /* borrowed from Path.WINDOWS */
    = System.getProperty("os.name").startsWith("Windows");

/** a Unix command to get the current user's name */
public final static String USER_NAME_COMMAND =
    (WINDOWS ? "/cygwin/bin/whoami" : "whoami");

/** a Unix command to get the current user's groups list */
public static String[] getGroupsCommand() {
  return new String[]{ (WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "groups"};
}

/** a Unix command to get a given user's groups list */
public static String[] getGroupsForUserCommand(final String user) {
  //'groups username' command return is non-consistent across different unixes
  return new String[] {(WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "id -Gn " + user};
}

/** a Unix command to get a given netgroup's user list */
public static String[] getUsersForNetgroupCommand(final String netgroup) {
  //'groups username' command return is non-consistent across different unixes
  return new String[] {(WINDOWS ? "/cygwin/bin/bash" : "bash"), "-c", "getent netgroup " + netgroup};
}

/** Return a Unix command to get permission information. */
public static String[] getGET_PERMISSION_COMMAND() {
  //force /bin/ls, except on windows.
  return new String[] {(WINDOWS ? "/cygwin/bin/ls" : "/bin/ls"), "-ld"};
}

/** a Unix command to set permission */
public static final String SET_PERMISSION_COMMAND =
    (WINDOWS ? "/cygwin/bin/chmod" : "chmod");

/** a Unix command to set owner */
public static final String SET_OWNER_COMMAND =
    (WINDOWS ? "/cygwin/bin/chown" : "chown");

/** a Unix command to set group */
public static final String SET_GROUP_COMMAND =
    (WINDOWS ? "/cygwin/bin/chgrp" : "chgrp");

/** a Unix command to get ulimit of a process. */
public static final String ULIMIT_COMMAND = "ulimit";
----
Lastly, and despite this fix, Hadoop's FileUtil complains about
RawLocalFileSystem, breaking during directory creation and
verification because the shell's return value is improperly parsed.
You can fix this in a number of ways. I took the lazy approach and
just made all mkdirs functions swallow IOExceptions silently.
RawLocalFileSystem.java:
/**
 * Creates the specified directory hierarchy. Does not
 * treat existence as an error.
 */
public boolean mkdirs(Path f) throws IOException {
  boolean b = false;
  try {
    Path parent = f.getParent();
    File p2f = pathToFile(f);
    b = (parent == null || mkdirs(parent))
        && (p2f.mkdir() || p2f.isDirectory());
  } catch (IOException e) {}
  return b;
}

/** {@inheritDoc} */
@Override
public boolean mkdirs(Path f, FsPermission permission) throws IOException {
  boolean b = false;
  try {
    b = mkdirs(f);
    setPermission(f, permission);
  } catch (IOException e) {}
  return b;
}
---
Finally, rebuild Hadoop with "ant -f build.xml compile".
Copy the jars from the build directory, overwriting the
existing jars in the Hadoop home directory.
Reformat the namenode and run start-all.sh.
You should see four Java processes: the namenode, datanode,
jobtracker, and tasktracker. That was a lot of yak shaving
just to get this running.
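Roughly, the last stretch looks like this (a sketch; jar names and build
output locations depend on your exact version and on which ant targets you run):
# sketch of the final rebuild/deploy cycle
cd "$HADOOP_HOME"
ant -f build.xml compile
cp build/hadoop-*.jar .          # overwrite the shipped jars with the rebuilt ones
bin/hadoop namenode -format
bin/start-all.sh
jps                              # expect NameNode, DataNode, JobTracker and TaskTracker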
> taskTracker could not start because "Failed to set permissions" to "ttprivate
> to 0700"
> --------------------------------------------------------------------------------------
>
> Key: HADOOP-7682
> URL: https://issues.apache.org/jira/browse/HADOOP-7682
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 0.20.203.0, 0.20.205.0, 1.0.0
> Environment: OS:WindowsXP SP3 , Filesystem :NTFS, cygwin 1.7.9-1,
> jdk1.6.0_05
> Reporter: Magic Xie
>
> ERROR org.apache.hadoop.mapred.TaskTracker:Can not start task tracker because
> java.io.IOException:Failed to set permissions of
> path:/tmp/hadoop-cyg_server/mapred/local/ttprivate to 0700
> at
> org.apache.hadoop.fs.RawLocalFileSystem.checkReturnValue(RawLocalFileSystem.java:525)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:499)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:318)
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:183)
> at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:635)
> at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1328)
> at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3430)
> Since hadoop0.20.203 when the TaskTracker initialize, it checks the
> permission(TaskTracker Line 624) of
> (org.apache.hadoop.mapred.TaskTracker.TT_LOG_TMP_DIR,org.apache.hadoop.mapred.TaskTracker.TT_PRIVATE_DIR,
>
> org.apache.hadoop.mapred.TaskTracker.TT_PRIVATE_DIR).RawLocalFileSystem(http://svn.apache.org/viewvc/hadoop/common/tags/release-0.20.203.0/src/core/org/apache/hadoop/fs/RawLocalFileSystem.java?view=markup)
> call setPermission (line 481) to deal with it. setPermission works fine on
> *nix; however, it does not always work on Windows.
> setPermission calls setReadable of java.io.File in line 498, but according
> to Table 1 provided by Oracle (see the link below), setReadable(false) will
> always return false on Windows, the same as setExecutable(false).
> http://java.sun.com/developer/technicalArticles/J2SE/Desktop/javase6/enhancements/
> Is this what causes the task tracker "Failed to set permissions" to "ttprivate to
> 0700"?
> Hadoop 0.20.202 works fine in the same environment.