Env: Windows 7 + Cygwin 1.7.11-1 + JDK 1.6.0_31 + Hadoop 1.0.0

So far, standalone mode works fine, but in pseudo-distributed or cluster mode the wordcount example always fails.

HDFS works fine, but the tasktracker cannot spawn child JVMs for new jobs, and /logs/userlogs/job-xxxx/attempt-xxxx/ stays empty. The tasktracker error log looks like this:

======================
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM : jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_000002_0 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
    at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
    at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
    at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
    at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
    (followed by the same stack trace as above)
=======================================

I tried remote-debugging the tasktracker. In org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID, boolean, String[]), line 97:

public static void createTaskAttemptLogDir(TaskAttemptID taskID,
    boolean isCleanup, String[] localDirs) throws IOException {
  String cleanupSuffix = isCleanup ? ".cleanup" : "";
  String strAttemptLogDir = getTaskAttemptLogDir(taskID,
      cleanupSuffix, localDirs);
  File attemptLogDir = new File(strAttemptLogDir);
  if (!attemptLogDir.mkdirs()) {
    throw new IOException("Creation of " + attemptLogDir + " failed.");
  }
  String strLinkAttemptLogDir =
      getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar +
      taskID.toString() + cleanupSuffix;
  if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
    throw new IOException("Creation of symlink from " +
        strLinkAttemptLogDir + " to " + strAttemptLogDir +
        " failed.");
  }
  //Set permissions for target attempt log dir
  FsPermission userOnly = new FsPermission((short) 0777); //FsPermission userOnly = new FsPermission((short) 0700);
  FileUtil.setPermission(attemptLogDir, userOnly);
}

and the symLink() function:

public static int symLink(String target, String linkname) throws IOException {
  String cmd = "ln -s " + target + " " + linkname;
  Process p = Runtime.getRuntime().exec(cmd, null);
  int returnVal = -1;
  try {
    returnVal = p.waitFor();
  } catch (InterruptedException e) {
    //do nothing as of yet
  }
  if (returnVal != 0) {
    LOG.warn("Command '" + cmd + "' failed " + returnVal +
        " with: " + copyStderr(p));
  }
  return returnVal;
}

So Hadoop first creates the log folder under ${hadoop.tmp.dir} and then invokes "ln -s" to create a symlink to it under /logs/userlogs/job-xxxx/attempt-xxxx. In my case:

strLinkAttemptLogDir = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
strAttemptLogDir     = /tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
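For reference, a throwaway check like the one below (hardcoded to the paths above, so purely illustrative) makes it easy to see what java.io.File reports for the link path that "ln -s" produces under Cygwin; given the behaviour described in the rest of this mail, it reports the link as an existing plain file rather than a directory, and mkdirs() on it fails.

import java.io.File;

// Throwaway check of how the JVM sees the link path created by Cygwin's "ln -s".
// The hardcoded path is the strLinkAttemptLogDir value quoted above; expected
// output here is exists=true, isDirectory=false, isFile=true, mkdirs=false.
public class LinkCheck {
  public static void main(String[] args) {
    File link = new File("D:/cygwin/home/timwu/hadoop-1.0.0/logs/userlogs/"
        + "job_201203280212_0005/attempt_201203280212_0005_m_000002_1");
    System.out.println("exists      = " + link.exists());
    System.out.println("isDirectory = " + link.isDirectory());
    System.out.println("isFile      = " + link.isFile());
    System.out.println("mkdirs      = " + link.mkdirs());
  }
}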
After the tasktracker spawns a subtask, it fails in the following function, org.apache.hadoop.mapred.DefaultTaskController.launchTask(String, String, String, List<String>, List<String>, File, String, String), line 107:

  ...
  //mkdir the loglocation
  String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
  if (!localFs.mkdirs(new Path(logLocation))) {
    throw new IOException("Mkdirs failed to create "
        + logLocation);
  }
  ...

Here mkdirs() returns false because logLocation is the symlink. In my case,

logLocation = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

If I open that path in Windows Explorer, it is just a plain file, not a folder and not a shortcut, and its content looks like:

<symlink>/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

mkdirs() is implemented as:

public boolean mkdirs(Path f) throws IOException {
  Path parent = f.getParent();
  File p2f = pathToFile(f);
  return (parent == null || mkdirs(parent)) &&
      (p2f.mkdir() || p2f.isDirectory());
}

so p2f.isDirectory() returns false and p2f.isFile() returns true; as far as Java is concerned the path is a file. Hence IOException("Mkdirs failed to create D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1") is thrown in the task-launching code, the child JVM exits with -1, and we get the exceptions shown above in the tasktracker log.

Is there any way to disable this symlink step, or is there another route I can take?
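One direction I have been thinking about, completely untested, is to resolve the Cygwin-style link file myself: read the target path recorded inside it and make sure that target directory exists, instead of calling mkdirs() on the link path. The "<symlink>" marker and the plain-text reading below are based only on what I saw when opening the file, so the real on-disk format may well differ.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch only, not a tested fix: if a path turns out to be a Cygwin-style link
// file (a plain file whose content records the link target, as observed above),
// return the recorded target so that the *target* directory can be created,
// instead of calling mkdirs() on the link path itself.
public class CygwinLinkResolver {
  private static final String MARKER = "<symlink>";

  public static File resolveOrSelf(File maybeLink) throws IOException {
    if (!maybeLink.isFile()) {
      return maybeLink; // already a real directory, or nothing there yet
    }
    byte[] buf = new byte[(int) maybeLink.length()];
    FileInputStream in = new FileInputStream(maybeLink);
    try {
      int off = 0;
      int n;
      while (off < buf.length && (n = in.read(buf, off, buf.length - off)) > 0) {
        off += n;
      }
    } finally {
      in.close();
    }
    String content = new String(buf, "UTF-8");
    int idx = content.indexOf(MARKER);
    if (idx < 0) {
      return maybeLink; // not a Cygwin link file after all
    }
    return new File(content.substring(idx + MARKER.length()).trim());
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical usage with the link path from above.
    File link = new File("D:/cygwin/home/timwu/hadoop-1.0.0/logs/userlogs/"
        + "job_201203280212_0005/attempt_201203280212_0005_m_000002_1");
    File target = resolveOrSelf(link);
    System.out.println("resolved target : " + target);
    System.out.println("mkdirs on target: " + (target.isDirectory() || target.mkdirs()));
  }
}

Even if that works, the child JVM would still have to write its logs through the original link path, so it may just move the failure; better ideas are very welcome.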
BTW, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name}, and my ${user.name} is timwu, so I expected the tmp folder to be created at /tmp/hadoop-timwu under Cygwin's root. In fact it is created at d:/tmp/hadoop-timwu (i.e. /cygdrive/d/tmp/hadoop-timwu from inside Cygwin). Is that expected?
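For what it's worth, this looks consistent with the JVM being a native Windows process rather than a Cygwin one: as far as I understand it, Windows resolves an absolute path that has no drive letter against the current drive, not against Cygwin's root. A minimal check of that assumption (nothing Hadoop-specific):

import java.io.File;

// Tiny check of how a Windows JVM resolves a drive-less absolute path.
// Started from a working directory on D:, this should print something like
// "D:\tmp\hadoop-timwu", i.e. the directory Hadoop actually created.
public class PathCheck {
  public static void main(String[] args) {
    System.out.println(new File("/tmp/hadoop-timwu").getAbsolutePath());
  }
}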
--
Best,
WU Pengcheng ( Tim )