[ 
https://issues.apache.org/jira/browse/HADOOP-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tim.wu updated HADOOP-8274:
---------------------------

    Description: 
Standalone mode works fine. But in pseudo-distributed or cluster mode it always 
throws errors, even when I just run the wordcount example.

HDFS works fine, but the tasktracker cannot create child JVMs for a new job. 
The directory /logs/userlogs/job-xxxx/attempt-xxxx/ stays empty.

The reason appears to be that on Windows, Java does not recognize a symlink to 
a folder as a folder.

The details are as follows.

======================================================================================================

First, the tasktracker error log looks like this:


======================
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: 
jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner 
jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed 
jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM : 
jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks 
it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_000002_0 
: Child Error
java.io.IOException: Task process exit with nonzero status of -1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): 
attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : 
attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots 
: 2 and trying to launch attempt_201203280212_0005_m_000002_1 which needs 1 
slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: 
attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: 
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index
 (The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at 
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at 
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at 
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at 
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at 
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for task: 
attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: 
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index
 (The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at 
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at 
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at 
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at 
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at 
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

=======================================

I tried to remote-debug the tasktracker. In

org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID, boolean, 
String[]), line 97:
public static void createTaskAttemptLogDir(TaskAttemptID taskID,
      boolean isCleanup, String[] localDirs) throws IOException{
    String cleanupSuffix = isCleanup ? ".cleanup" : "";
    String strAttemptLogDir = getTaskAttemptLogDir(taskID, 
        cleanupSuffix, localDirs);
    File attemptLogDir = new File(strAttemptLogDir);
    if (!attemptLogDir.mkdirs()) {
      throw new IOException("Creation of " + attemptLogDir + " failed.");
    }
    String strLinkAttemptLogDir = 
        getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar + 
        taskID.toString() + cleanupSuffix;
    if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
      throw new IOException("Creation of symlink from " + 
                            strLinkAttemptLogDir + " to " + strAttemptLogDir +
                            " failed.");
    }
    //Set permissions for target attempt log dir 
    FsPermission userOnly = new FsPermission((short) 0777); // originally: new FsPermission((short) 0700)
    FileUtil.setPermission(attemptLogDir, userOnly);
  }
and the symLink() function:
public static int symLink(String target, String linkname) throws IOException{
    String cmd = "ln -s " + target + " " + linkname;
    Process p = Runtime.getRuntime().exec(cmd, null);
    int returnVal = -1;
    try{
      returnVal = p.waitFor();
    } catch(InterruptedException e){
      //do nothing as of yet
    }
    if (returnVal != 0) {
      LOG.warn("Command '" + cmd + "' failed " + returnVal + 
               " with: " + copyStderr(p));
    }
    return returnVal;
  }

we can see that Hadoop creates the log folder under ${hadoop.tmp.dir} and then 
invokes "ln -s" to create a symlink to it under /logs/userlogs/job-xxxx/attempt-xxxx.

In my case, 
strLinkAttemptLogDir = 
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
strAttemptLogDir=/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

After the tasktracker spawns a child task, the error occurs in the following 
function,

org.apache.hadoop.mapred.DefaultTaskController.launchTask(String, String, String, 
List<String>, List<String>, File, String, String), line 107:
      ...............
      //mkdir the loglocation
      String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
      if (!localFs.mkdirs(new Path(logLocation))) {
        throw new IOException("Mkdirs failed to create " 
                   + logLocation);
      }
     ..............

mkdirs() returns false because logLocation is a symlink file. In my case it is 
logLocation=D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1.
If I open it in Windows Explorer it is just a plain file, not a folder or a 
shortcut, and its content looks like:

<symlink>/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

This is because mkdirs() is implemented as:
public boolean mkdirs(Path f) throws IOException {
    Path parent = f.getParent();
    File p2f = pathToFile(f);
    return (parent == null || mkdirs(parent)) &&
      (p2f.mkdir() || p2f.isDirectory());
  }

So p2f.isDirectory() returns false, while p2f.isFile() returns true; to Java it 
is a regular file. Hence IOException("Mkdirs failed to create 
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1")
is thrown in the child task, which exits with -1, and we then get the above 
exception in the tasktracker.
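
To confirm this outside of Hadoop, here is a minimal reproduction sketch (my own 
test code, not part of Hadoop; the class name and paths are placeholders): it 
creates a Cygwin symlink to an existing directory with "ln -s" and then asks 
java.io.File how it classifies the link. If the analysis above is right, the 
link should be reported as a plain file.

import java.io.File;

// Reproduction sketch (not Hadoop code; paths are placeholders). It assumes
// Cygwin's ln is on the PATH of the JVM process.
public class CygwinSymlinkCheck {
  public static void main(String[] args) throws Exception {
    File target = new File("D:/tmp/symlink-target");  // a real directory
    File link   = new File("D:/tmp/symlink-link");    // the Cygwin symlink to create
    target.mkdirs();

    // Same mechanism as FileUtil.symLink(): shell out to "ln -s".
    Process p = Runtime.getRuntime().exec(
        new String[] {"ln", "-s", target.getPath(), link.getPath()});
    System.out.println("ln -s exit code : " + p.waitFor());

    // If the Cygwin symlink is just a small regular file on the Windows side,
    // Java should report:
    System.out.println("isDirectory()   : " + link.isDirectory()); // expected: false
    System.out.println("isFile()        : " + link.isFile());      // expected: true
    System.out.println("mkdir()         : " + link.mkdir());       // expected: false
  }
}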

Is there any way to disable this symlink creation, or some other workaround I 
can follow?
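
One direction I can imagine (an untested sketch of my own, not something Hadoop 
currently does) is to avoid Cygwin's ln on Windows and create an NTFS directory 
junction with "cmd /c mklink /J" instead; as far as I know junctions do not need 
administrator rights and java.io.File treats them as directories, so the later 
mkdirs()/isDirectory() check would pass. Both paths would have to be 
Windows-style (e.g. converted with cygpath -w), which is assumed here and not 
shown.

import java.io.IOException;

// Hedged workaround sketch, not part of Hadoop: a replacement for the
// "ln -s" call that uses an NTFS directory junction on Windows.
public final class WindowsDirLink {
  public static int makeDirLink(String target, String link)
      throws IOException, InterruptedException {
    boolean windows =
        System.getProperty("os.name").toLowerCase().contains("windows");
    String[] cmd = windows
        // mklink is a cmd.exe built-in; note the argument order: link first,
        // then target (the opposite of "ln -s"). Both must be Windows paths.
        ? new String[] {"cmd", "/c", "mklink", "/J", link, target}
        : new String[] {"ln", "-s", target, link};
    return Runtime.getRuntime().exec(cmd).waitFor();
  }
}

Whether something like this would be acceptable inside FileUtil.symLink() is a 
separate question; it is only meant to show that a non-symlink link type can 
satisfy Java's directory check.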

By the way, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name}, 
and my ${user.name} is timwu, so I expected the temporary folder to be created 
as /tmp/hadoop-timwu under the Cygwin root. Instead, it is created as 
d:/tmp/hadoop-timwu, which in Cygwin is /cygdrive/d/tmp/hadoop-timwu. Is that 
correct?
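
If I understand the path handling correctly, this is expected: /tmp/hadoop-timwu 
is not interpreted through Cygwin by the JVM. On Windows, a path without a drive 
letter is resolved against the drive of the process's working directory, which 
here is D:. A tiny illustration (my own example, not Hadoop code):

import java.io.File;

// Illustration only: how Java resolves a drive-less absolute path on Windows.
public class DriveRelativePath {
  public static void main(String[] args) {
    // With the tasktracker's working directory somewhere on D:, this should
    // print D:\tmp\hadoop-timwu, which Cygwin shows as /cygdrive/d/tmp/hadoop-timwu.
    System.out.println(new File("/tmp/hadoop-timwu").getAbsolutePath());
  }
}

So if the temporary directory really needs to live under the Cygwin 
installation, hadoop.tmp.dir would probably have to be an explicit Windows path 
(for example D:/cygwin/tmp/hadoop-${user.name}; again an assumption, not 
tested).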


        Summary: In pseudo or cluster model under Cygwin, tasktracker can not 
create a new job because of symlink problem.  (was: In pseudo or cluster model 
Under cygwin, tasktracker can not create a new job because of symlink problem.)
    
> In pseudo or cluster model under Cygwin, tasktracker can not create a new job 
> because of symlink problem.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8274
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8274
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.205.0, 1.0.0, 1.0.1, 0.22.0
>         Environment: windows7+cygwin 1.7.11-1+jdk1.6.0_31+hadoop 1.0.0
>            Reporter: tim.wu
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
