[
https://issues.apache.org/jira/browse/HADOOP-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
tim.wu updated HADOOP-8274:
---------------------------
Description:
Standalone mode works. In pseudo-distributed or cluster mode, however, the
examples always fail, even the simple wordcount example.
HDFS works fine, but the TaskTracker cannot launch child JVMs for a new job,
and /logs/userlogs/job-xxxx/attempt-xxxx/ is empty.
The cause appears to be that, on Windows, Java does not recognize a symlink to
a folder as a folder.
Details follow.
======================================================================================================
First, the error log of tasktracker is like:
======================
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM : jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_000002_0 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
    at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
    at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
    at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
    at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
    at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
    at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
    at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
    at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
=======================================
I tried remote-debugging the TaskTracker. In
org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID, boolean,
String[]), line 97:
public static void createTaskAttemptLogDir(TaskAttemptID taskID,
    boolean isCleanup, String[] localDirs) throws IOException {
  String cleanupSuffix = isCleanup ? ".cleanup" : "";
  String strAttemptLogDir = getTaskAttemptLogDir(taskID,
      cleanupSuffix, localDirs);
  File attemptLogDir = new File(strAttemptLogDir);
  if (!attemptLogDir.mkdirs()) {
    throw new IOException("Creation of " + attemptLogDir + " failed.");
  }
  String strLinkAttemptLogDir =
      getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar +
      taskID.toString() + cleanupSuffix;
  if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
    throw new IOException("Creation of symlink from " +
        strLinkAttemptLogDir + " to " + strAttemptLogDir +
        " failed.");
  }
  //Set permissions for target attempt log dir
  FsPermission userOnly = new FsPermission((short) 0777); //FsPermission userOnly = new FsPermission((short) 0700);
  FileUtil.setPermission(attemptLogDir, userOnly);
}
and the symLink() function:
public static int symLink(String target, String linkname) throws IOException {
  String cmd = "ln -s " + target + " " + linkname;
  Process p = Runtime.getRuntime().exec(cmd, null);
  int returnVal = -1;
  try {
    returnVal = p.waitFor();
  } catch (InterruptedException e) {
    //do nothing as of yet
  }
  if (returnVal != 0) {
    LOG.warn("Command '" + cmd + "' failed " + returnVal +
        " with: " + copyStderr(p));
  }
  return returnVal;
}
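To check whether a link is actually usable from the JVM's point of view, something like the following sketch could be run. It is not Hadoop code: it assumes Java 7 or later (my environment uses JDK 1.6), it creates the link with java.nio.file.Files.createSymbolicLink instead of shelling out to "ln -s", and the class and method names (LinkCheckSketch, linkIsUsableAsDirectory, demo) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LinkCheckSketch {

    // Creates linkName -> target with the JDK's own API (not "ln -s") and
    // reports whether this JVM then sees the link as a directory.
    public static boolean linkIsUsableAsDirectory(Path target, Path linkName) {
        try {
            Files.createSymbolicLink(linkName, target);
        } catch (IOException | UnsupportedOperationException e) {
            // For example: missing privilege on Windows, or a file system
            // without symbolic link support.
            return false;
        }
        File link = linkName.toFile();
        return Files.isSymbolicLink(linkName) && link.isDirectory();
    }

    // Hypothetical demo: a temporary target directory and a link next to it.
    public static boolean demo() {
        try {
            Path base = Files.createTempDirectory("linkcheck");
            Path target = Files.createDirectory(base.resolve("attempt_target"));
            Path link = base.resolve("attempt_link");
            return linkIsUsableAsDirectory(target, link);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("link usable as directory: " + demo());
    }
}
```

If a check like this returns false right after the link is created, the later mkdirs() call on the link path can be expected to fail in the same way as described in this report.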
Hadoop creates a log folder under ${hadoop.tmp.dir} and then invokes "ln -s"
to create a symlink to it under /logs/userlogs/job-xxxx/attempt-xxxx.
In my case,
strLinkAttemptLogDir=D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
strAttemptLogDir=/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
After the TaskTracker spawns a child process, the error occurs in the
following function:
org.apache.hadoop.mapred.DefaultTaskController.launchTask(String,
String, String, List<String>, List<String>, File, String, String), line 107:
...............
//mkdir the loglocation
String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
if (!localFs.mkdirs(new Path(logLocation))) {
throw new IOException("Mkdirs failed to create "
+ logLocation);
}
..............
mkdirs() returns false because logLocation is a symlink file. In my case,
logLocation=D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
If I open it from Windows Explorer, it is just a file, not a folder or a
shortcut. Its content looks like:
<symlink>/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
This is because mkdirs() is implemented as:
public boolean mkdirs(Path f) throws IOException {
  Path parent = f.getParent();
  File p2f = pathToFile(f);
  return (parent == null || mkdirs(parent)) &&
      (p2f.mkdir() || p2f.isDirectory());
}
Here p2f.mkdir() fails because the path already exists, p2f.isDirectory()
returns false, and p2f.isFile() returns true. So, to Java, it is a file.
Hence, IOException("Mkdirs failed to create
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1")
is thrown in the child process, which exits with -1. We then get the exception
shown above in the main thread.
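The failing check can be reproduced outside Hadoop with a minimal sketch. It assumes that a regular file is a fair stand-in for how the Windows JVM sees the Cygwin link file; no real Cygwin symlink is created, and the class and helper names (SymlinkAsFileDemo, createFakeLinkFile, mkdirOrIsDirectory) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class SymlinkAsFileDemo {

    // The same leaf-path check that the mkdirs() excerpt applies.
    public static boolean mkdirOrIsDirectory(File f) {
        return f.mkdir() || f.isDirectory();
    }

    // Creates a regular file standing in for the Cygwin link file
    // (an assumption for illustration; the content is a placeholder).
    public static File createFakeLinkFile() {
        try {
            File dir = Files.createTempDirectory("userlogs-demo").toFile();
            File link = new File(dir, "attempt_demo");
            Files.write(link.toPath(),
                    "<symlink>/tmp/target".getBytes(StandardCharsets.UTF_8));
            return link;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File link = createFakeLinkFile();
        System.out.println("isFile:      " + link.isFile());
        System.out.println("isDirectory: " + link.isDirectory());
        System.out.println("mkdir check: " + mkdirOrIsDirectory(link));
    }
}
```

Because the path already exists as a regular file, mkdir() cannot create a directory there and isDirectory() is false, so the combined check fails, which matches the "Mkdirs failed to create" exception.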
Is there any way to disable this symlink? Or is there another workaround I can
follow?
BTW, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name}, and
my ${user.name} is timwu. So it should create a tmp folder /tmp/hadoop-timwu
under Cygwin's root. In fact, however, it creates the folder
d:/tmp/hadoop-timwu, which in Cygwin is /cygdrive/d/tmp/hadoop-timwu. Is this
correct?
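A small sketch for inspecting how the JVM itself resolves such a path. My understanding, which I have not verified against Hadoop's own path handling, is that on Windows java.io.File treats a path that starts with a slash but has no drive letter as drive-relative, so it is resolved against the current drive rather than the Cygwin root. The class and method names (TmpDirResolution, describe) are made up for illustration.

```java
import java.io.File;

public class TmpDirResolution {

    // Reports whether the JVM considers the configured path absolute and
    // what it resolves to on the current platform.
    public static String describe(String configured) {
        File f = new File(configured);
        return configured + " -> absolute=" + f.isAbsolute()
                + ", resolved=" + f.getAbsolutePath();
    }

    public static void main(String[] args) {
        // The value hadoop.tmp.dir expands to in this report.
        System.out.println(describe("/tmp/hadoop-timwu"));
    }
}
```

Running this from the same working directory and drive as the TaskTracker should show which Windows folder the JVM really uses for /tmp/hadoop-timwu.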
Summary: In pseudo or cluster model Under cygwin, tasktracker can not
create a new job because of symlink problem. (was: Under cygwin, hadoop throws
exception in pseudo or cluster model)
> In pseudo or cluster model Under cygwin, tasktracker can not create a new job
> because of symlink problem.
> ---------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-8274
> URL: https://issues.apache.org/jira/browse/HADOOP-8274
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.20.205.0, 1.0.0, 1.0.1, 0.22.0
> Environment: windows7+cygwin 1.7.11-1+jdk1.6.0_31+hadoop 1.0.0
> Reporter: tim.wu
>