[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031555#comment-13031555 ] Konstantin Shvachko commented on HDFS-1371: --- Projecting JT/TT blacklisting logic to replicas doesn't work very well when all replicas are corrupt. The client will not report in this case, although it is critical. The argument that it never happened in practice is not very strong as it can happen any time. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1378: - Attachment: hdfs-1378.1.patch Updated patch which fixes the {{TestEditLog}} test failure. The only difference between my original patch and this one is these lines in {{TestEditLog.java}}: {noformat} -} catch (ChecksumException e) { +} catch (IOException e) { // expected + assertEquals("Cause of exception should be ChecksumException", + e.getCause().getClass(), ChecksumException.class); {noformat} I believe the other test failures are presently failing on trunk. Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
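A small illustration of the idea this issue asks for - tracking the byte offset of the opcode currently being replayed so failures can be reported with a position. This is only a sketch of the approach, not the code in hdfs-1378.1.patch; the class name and structure here are assumptions:
{code}
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: wrap the edits-file stream and count bytes as they are read, so a
// replay failure can report where in the file the bad opcode sits. A full
// version would also need to track skip().
class PositionTrackingInputStream extends FilterInputStream {
  private long pos = 0;

  PositionTrackingInputStream(InputStream in) {
    super(in);
  }

  @Override
  public int read() throws IOException {
    int b = in.read();
    if (b != -1) pos++;
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int n = in.read(buf, off, len);
    if (n > 0) pos += n;
    return n;
  }

  // Offset to include in error messages, and to estimate replay progress
  // against the file's total length during startup.
  long getPos() {
    return pos;
  }
}
{code}
The same counter serves both goals in the description: error messages can report the offsets of the last few opcodes, and getPos() relative to the file length gives a rough replay-progress estimate.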
[jira] [Updated] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
[ https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1814: - Attachment: hdfs-1814.5.patch Updated patch addressing review comments. bq. I also noted that the ACLs are mostly listed in src/test/hadoop-policy.xml. Do you think we need to update that for any reason? (not sure I can figure out why it's in there) I did a little digging. I think this is only necessary if there are tests which exercise the authorization policy of this protocol. So, not necessary here. HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent --- Key: HDFS-1814 URL: https://issues.apache.org/jira/browse/HDFS-1814 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client, name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031596#comment-13031596 ] Hadoop QA commented on HDFS-1378: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478776/hdfs-1378.1.patch against trunk revision 1101753. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/479//console This message is automatically generated. Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
[ https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031597#comment-13031597 ] Hadoop QA commented on HDFS-1814: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478780/hdfs-1814.5.patch against trunk revision 1101753. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/480//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/480//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/480//console This message is automatically generated. HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent --- Key: HDFS-1814 URL: https://issues.apache.org/jira/browse/HDFS-1814 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client, name-node Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria reassigned HDFS-1615: --- Assignee: Harsh J Chouraria (was: Todd Lipcon) seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Harsh J Chouraria Labels: newbie After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
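A minimal repro of the reported behavior might look like the following; the file path and cluster setup are assumptions, but the essential sequence is read, close, then seek:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SeekAfterClose {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path to any existing file on the cluster.
    FSDataInputStream in = fs.open(new Path("/tmp/some-file"));
    in.read();
    in.close();
    in.seek(1); // throws NullPointerException on the affected versions
  }
}
{code}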
[jira] [Commented] (HDFS-1912) Update tests for FsShell standardized error messages
[ https://issues.apache.org/jira/browse/HDFS-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031793#comment-13031793 ] Daryn Sharp commented on HDFS-1912: --- Test failures are unrelated to this patch. Update tests for FsShell standardized error messages Key: HDFS-1912 URL: https://issues.apache.org/jira/browse/HDFS-1912 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-1912.patch Need to update the FsShell based tests for commonized error messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031837#comment-13031837 ] Aaron T. Myers commented on HDFS-1378: -- All of the test failures except for {{TestEditLogFileOutputStream}} are known to be failing on trunk. The {{TestEditLogFileOutputStream}} failure appears to be transient. It passes on my box, and this is the message it failed with in the Jenkins run: {noformat} java.net.BindException: Port in use: 0.0.0.0:50070 {noformat} Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031853#comment-13031853 ] Tanping Wang commented on HDFS-1371: In the case where all replicas are corrupted, there is nothing the cluster can do to recover. Reporting this to the name node or not would not make a difference. We can argue that if an operator knows about the block corruption, he can choose to physically copy the file from somewhere else. However, in real life, based on Koji's experience over the past couple of years, it has never been the case on Yahoo's clusters that all block replicas are corrupted. Beyond this point, we do not want to rely too much on a client to report block corruption, and want to restrict the solution to just dealing with a handicapped client. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1899) GenericTestUtils.formatNamenode is misplaced
[ https://issues.apache.org/jira/browse/HDFS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1899: - Attachment: HDFS-1899.patch Moved formatNameNode() to DFSTestUtil GenericTestUtils.formatNamenode is misplaced Key: HDFS-1899 URL: https://issues.apache.org/jira/browse/HDFS-1899 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Todd Lipcon Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1899.patch This function belongs in DFSTestUtil, the standard place for putting cluster-related utils. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1916) Fake jira for illustrating workflow (sorry)
[ https://issues.apache.org/jira/browse/HDFS-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1916: -- Attachment: hdfs-1916.txt Attached patch fixes the issue (not really) Fake jira for illustrating workflow (sorry) --- Key: HDFS-1916 URL: https://issues.apache.org/jira/browse/HDFS-1916 Project: Hadoop HDFS Issue Type: Task Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.23.0 Reporter: Todd Lipcon Priority: Trivial Attachments: hdfs-1916.txt The namenode explodes when it eats too much. Steps to reproduce: a) eat too much. b) explode -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1916) Fake jira for illustrating workflow (sorry)
[ https://issues.apache.org/jira/browse/HDFS-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1916: -- Assignee: Todd Lipcon Status: Patch Available (was: Open) Fake jira for illustrating workflow (sorry) --- Key: HDFS-1916 URL: https://issues.apache.org/jira/browse/HDFS-1916 Project: Hadoop HDFS Issue Type: Task Components: documentation Affects Versions: 0.21.0, 0.20.2, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hdfs-1916.txt The namenode explodes when it eats too much. Steps to reproduce: a) eat too much. b) explode -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1916) Fake jira for illustrating workflow (sorry)
[ https://issues.apache.org/jira/browse/HDFS-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031884#comment-13031884 ] Eli Collins commented on HDFS-1916: --- -1 This patch requires more work; adjusting NUM_THREADS is not enough. Fake jira for illustrating workflow (sorry) --- Key: HDFS-1916 URL: https://issues.apache.org/jira/browse/HDFS-1916 Project: Hadoop HDFS Issue Type: Task Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hdfs-1916.txt The namenode explodes when it eats too much. Steps to reproduce: a) eat too much. b) explode -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1916) Fake jira for illustrating workflow (sorry)
[ https://issues.apache.org/jira/browse/HDFS-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1916: -- Resolution: Invalid Status: Resolved (was: Patch Available) Fake jira for illustrating workflow (sorry) --- Key: HDFS-1916 URL: https://issues.apache.org/jira/browse/HDFS-1916 Project: Hadoop HDFS Issue Type: Task Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hdfs-1916.txt The namenode explodes when it eats too much. Steps to reproduce: a) eat too much. b) explode -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031891#comment-13031891 ] Hudson commented on HDFS-1052: -- Integrated in Hadoop-Mapreduce-trunk-Commit #663 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/663/]) MAPREDUCE-2467. HDFS-1052 changes break the raid contrib module in MapReduce. (suresh srinivas via mahadev) HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: Block pool proposal.pdf, HDFS-1052.3.patch, HDFS-1052.4.patch, HDFS-1052.5.patch, HDFS-1052.6.patch, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jolly Chen updated HDFS-1621: - Status: Patch Available (was: Open) Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jolly Chen updated HDFS-1621: - Attachment: hdfs-1621.txt Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1915) fuse-dfs does not support append
[ https://issues.apache.org/jira/browse/HDFS-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-1915: -- Component/s: (was: name-node) contrib/fuse-dfs Issue Type: New Feature (was: Bug) Summary: fuse-dfs does not support append (was: Error in create file while using FUSE) Fuse-dfs does not currently support append (which is what the FTP client is trying to do). The fuse-dfs code in CDH3, btw, is essentially the same as what's in trunk; the development is done on trunk first. fuse-dfs does not support append Key: HDFS-1915 URL: https://issues.apache.org/jira/browse/HDFS-1915 Project: Hadoop HDFS Issue Type: New Feature Components: contrib/fuse-dfs Affects Versions: 0.20.2 Environment: Ubuntu 10.04 LTS on EC2 Reporter: Sampath K Environment: Cloudera CDH3, EC2 cluster with 2 data nodes and 1 name node (using Ubuntu 10.04 LTS large instances), mounted hdfs in OS using fuse-dfs. Able to do HDFS fs -put but when I try to use an FTP client (ftp PUT) to do the same, I get the following error. I am using vsFTPd on the server. Changed the mounted folder permissions to a+w to rule out any WRITE permission issues. I was able to do an FTP GET on the same mounted volume. Please advise. FTPd Log == Tue May 10 23:45:00 2011 [pid 2] CONNECT: Client 127.0.0.1 Tue May 10 23:45:09 2011 [pid 1] [ftpuser] OK LOGIN: Client 127.0.0.1 Tue May 10 23:48:41 2011 [pid 3] [ftpuser] OK DOWNLOAD: Client 127.0.0.1, /hfsmnt/upload/counter.txt, 10 bytes, 0.42Kbyte/sec Tue May 10 23:49:24 2011 [pid 3] [ftpuser] FAIL UPLOAD: Client 127.0.0.1, /hfsmnt/upload/counter1.txt, 0.00Kbyte/sec Error in Namenode Log (I did an FTP GET on counter.txt and a PUT with counter1.txt) === 2011-05-11 01:03:02,822 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=ftpuser ip=/10.32.77.36 cmd=listStatus src=/upload dst=null perm=null 2011-05-11 01:03:02,825 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.32.77.36 cmd=listStatus src=/upload dst=null perm=null 2011-05-11 01:03:20,275 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root ip=/10.32.77.36 cmd=listStatus src=/upload dst=null perm=null 2011-05-11 01:03:20,290 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=ftpuser ip=/10.32.77.36 cmd=open src=/upload/counter.txt dst=null perm=null 2011-05-11 01:03:31,115 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 2011-05-11 01:03:31,115 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000, call append(/upload/counter1.txt, DFSClient_1590956638) from 10.32.77.36:56454: error: java.io.FileNotFoundException: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 java.io.FileNotFoundException: failed to append to non-existent file /upload/counter1.txt on client 10.32.77.36 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1166) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1336) at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:596) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1411) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1409) No activity shows up in datanode logs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
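For context on the FileNotFoundException above: at the FileSystem API level, append() requires the file to already exist, while a fresh upload needs create(), and the namenode log shows the write arriving as an append() call. A sketch of the distinction under hypothetical setup (the fuse-dfs internals are not shown, and on 0.20.x append generally also has to be enabled via dfs.support.append):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateVsAppend {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/upload/counter1.txt"); // path from the log above
    FSDataOutputStream out;
    if (fs.exists(p)) {
      // append() throws FileNotFoundException if the file does not exist,
      // which is exactly the error in the namenode log above.
      out = fs.append(p);
    } else {
      out = fs.create(p); // what a fresh FTP upload actually needs
    }
    out.write("1\n".getBytes("UTF-8"));
    out.close();
  }
}
{code}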
[jira] [Updated] (HDFS-1912) Update tests for FsShell standardized error messages
[ https://issues.apache.org/jira/browse/HDFS-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1912: - Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I have committed this. Thanks, Daryn! Also thanks Aaron for reviewing it. Update tests for FsShell standardized error messages Key: HDFS-1912 URL: https://issues.apache.org/jira/browse/HDFS-1912 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1912.patch Need to update the FsShell based tests for commonized error messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1865) Share LeaseChecker thread among DFSClients
[ https://issues.apache.org/jira/browse/HDFS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031912#comment-13031912 ] Hudson commented on HDFS-1865: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) Share LeaseChecker thread among DFSClients -- Key: HDFS-1865 URL: https://issues.apache.org/jira/browse/HDFS-1865 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1865_20110503.patch, h1865_20110504.patch, h1865_20110508.patch, h1865_20110508b.patch, h1865_20110509.patch Each {{DFSClient}} runs a {{LeaseChecker}} thread within a JVM. The number of threads could be reduced by sharing them. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
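The sharing idea can be sketched as a process-wide singleton that renews leases for all registered clients from one daemon thread. Everything below is illustrative only - the names, interval, and structure are assumptions, not the design in the attached patches:
{code}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SharedLeaseRenewer {
  private static final SharedLeaseRenewer INSTANCE = new SharedLeaseRenewer();
  // One entry per live client; each Runnable renews that client's lease.
  private final Set<Runnable> renewals =
      Collections.newSetFromMap(new ConcurrentHashMap<Runnable, Boolean>());
  private Thread daemon;

  static SharedLeaseRenewer getInstance() { return INSTANCE; }

  synchronized void register(Runnable renewal) {
    renewals.add(renewal);
    if (daemon == null) { // lazily start the single shared thread
      daemon = new Thread(new Runnable() {
        public void run() {
          try {
            while (true) {
              for (Runnable r : renewals) {
                r.run();
              }
              Thread.sleep(30000); // hypothetical renewal interval
            }
          } catch (InterruptedException e) {
            // shutting down
          }
        }
      }, "SharedLeaseRenewer");
      daemon.setDaemon(true);
      daemon.start();
    }
  }

  synchronized void unregister(Runnable renewal) {
    renewals.remove(renewal);
  }
}
{code}
With this shape, N DFSClient instances in a JVM cost one renewal thread instead of N.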
[jira] [Commented] (HDFS-1906) Remove logging exception stack trace when one of the datanode targets to read from is not reachable
[ https://issues.apache.org/jira/browse/HDFS-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031913#comment-13031913 ] Hudson commented on HDFS-1906: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) Remove logging exception stack trace when one of the datanode targets to read from is not reachable --- Key: HDFS-1906 URL: https://issues.apache.org/jira/browse/HDFS-1906 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.203.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1906.2.patch, HDFS-1906.patch When a client fails to connect to one of the datanodes from the list of block locations returned, an exception stack trace is printed in the client log. This is an expected failure scenario that is handled at the client by going to the next location. Printing the entire stack trace is unnecessary; just printing the exception message should be sufficient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
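The change amounts to logging the exception's message rather than the full trace for this expected, handled case. A hypothetical before/after sketch, not the patch itself:
{code}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class ReadRetryLogging {
  private static final Log LOG = LogFactory.getLog(ReadRetryLogging.class);

  static void onConnectFailure(String targetAddr, IOException ioe) {
    // Before: passing the exception as the second argument logs the whole
    // stack trace, even though the client simply moves on to the next
    // replica location:
    //   LOG.warn("Failed to connect to " + targetAddr, ioe);
    // After: the exception's message alone is sufficient here.
    LOG.warn("Failed to connect to " + targetAddr
        + ", will try the next datanode: " + ioe);
  }
}
{code}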
[jira] [Commented] (HDFS-1827) TestBlockReplacement waits forever, errs without giving information
[ https://issues.apache.org/jira/browse/HDFS-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031914#comment-13031914 ] Hudson commented on HDFS-1827: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) TestBlockReplacement waits forever, errs without giving information --- Key: HDFS-1827 URL: https://issues.apache.org/jira/browse/HDFS-1827 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Matt Foley Assignee: Matt Foley Fix For: 0.23.0 Attachments: 1827_TestBlockReplacement_v2.patch, 1827_TestBlockReplacement_v2.patch, 1827_TestBlockReplacement_v3.patch, TestBlockReplacement.java.patch In method checkBlocks(), TestBlockReplacement waits forever on a condition. Failures result in a Hudson/Jenkins "Timeout occurred" error message with no information about where or why. Need to replace this with a TimeoutException that throws a stack trace and useful info about the failure mode. Also investigate possible cause of failure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
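The shape of the fix is a bounded wait that fails loudly. A sketch of the idea, assuming checkBlocks() polls some condition; the helper below is illustrative, not the patch's code:
{code}
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

class TimedWait {
  // Poll until the condition holds, or throw a TimeoutException whose stack
  // trace and message say what was being waited for and for how long,
  // instead of blocking forever.
  static void waitFor(Callable<Boolean> condition, long timeoutMillis)
      throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (!condition.call()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException(
            "Condition not met within " + timeoutMillis + " ms");
      }
      Thread.sleep(100);
    }
  }
}
{code}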
[jira] [Commented] (HDFS-1912) Update tests for FsShell standardized error messages
[ https://issues.apache.org/jira/browse/HDFS-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031916#comment-13031916 ] Hudson commented on HDFS-1912: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) HDFS-1912. Update tests for FsShell standardized error messages. Contributed by Daryn Sharp Update tests for FsShell standardized error messages Key: HDFS-1912 URL: https://issues.apache.org/jira/browse/HDFS-1912 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1912.patch Need to update the FsShell based tests for commonized error messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1902) Fix path display for setrep
[ https://issues.apache.org/jira/browse/HDFS-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031919#comment-13031919 ] Hudson commented on HDFS-1902: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) Fix path display for setrep --- Key: HDFS-1902 URL: https://issues.apache.org/jira/browse/HDFS-1902 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1902.patch See HDFS-1901. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1544) Ivy resolve force mode should be turned off by default
[ https://issues.apache.org/jira/browse/HDFS-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031918#comment-13031918 ] Hudson commented on HDFS-1544: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) Ivy resolve force mode should be turned off by default -- Key: HDFS-1544 URL: https://issues.apache.org/jira/browse/HDFS-1544 Project: Hadoop HDFS Issue Type: Bug Reporter: Luke Lu Assignee: Luke Lu Fix For: 0.22.0 Attachments: hdfs-1544-trunk-v1.patch, hdfs-1544-trunk-v2.patch cf. HADOOP-7068 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1873) Federation Cluster Management Web Console
[ https://issues.apache.org/jira/browse/HDFS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031917#comment-13031917 ] Hudson commented on HDFS-1873: -- Integrated in Hadoop-Hdfs-trunk-Commit #637 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/637/]) Federation Cluster Management Web Console - Key: HDFS-1873 URL: https://issues.apache.org/jira/browse/HDFS-1873 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: ClusterSummary.png, Decommission.png, HDFS-1873.2.patch, HDFS-1873.3.patch, HDFS-1873.4.patch, HDFS-1873.patch The Federation cluster management console provides # Cluster summary information that shows overall cluster utilization, with a list of the name nodes that reports the used space, files and directories, blocks, and live and dead datanodes of each namespace. # Decommissioning status of all the data nodes that are in the process of decommissioning or have been decommissioned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HDFS-1332: Assignee: Ted Yu When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Carey reassigned HDFS-1615: - Assignee: Scott Carey (was: Harsh J Chouraria) seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1918) DataXceiver double logs every IOE out of readBlock
DataXceiver double logs every IOE out of readBlock -- Key: HDFS-1918 URL: https://issues.apache.org/jira/browse/HDFS-1918 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.20.2 Reporter: Jean-Daniel Cryans Priority: Trivial Fix For: 0.22.0 DataXceiver will log an IOE twice because opReadBlock() will catch it, log a WARN, then throw it again only to be caught in run() as a Throwable and logged as an ERROR. As far as I can tell all the information is the same in both messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
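The double-logging pattern being described, distilled into a hypothetical shape (this is not DataXceiver's actual code; logging the failure at a single place would be enough):
{code}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class DoubleLogging {
  private static final Log LOG = LogFactory.getLog(DoubleLogging.class);

  void opReadBlock() throws IOException {
    try {
      readBlock();
    } catch (IOException ioe) {
      LOG.warn("Failed to read block", ioe); // logged once here...
      throw ioe;
    }
  }

  void run() {
    try {
      opReadBlock();
    } catch (Throwable t) {
      // ...and the same failure, carrying the same information, is
      // logged a second time here.
      LOG.error("Error processing request", t);
    }
  }

  private void readBlock() throws IOException {
    throw new IOException("replica is unreadable");
  }
}
{code}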
[jira] [Updated] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Carey updated HDFS-1615: -- Attachment: HDFS-1615.patch Patch contains simple fix, and unit test that fails without the fix. seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Carey updated HDFS-1615: -- Status: Patch Available (was: Open) seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031964#comment-13031964 ] Todd Lipcon commented on HDFS-1615: --- +1 pending Hudson test results seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-1062) Improve error messages for failed completeFile
[ https://issues.apache.org/jira/browse/HDFS-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh resolved HDFS-1062. -- Resolution: Duplicate According to Todd, this is already completed by HDFS-1141 Improve error messages for failed completeFile -- Key: HDFS-1062 URL: https://issues.apache.org/jira/browse/HDFS-1062 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client, name-node Reporter: Todd Lipcon Assignee: Jonathan Hsieh Labels: newbie In practice I often see users confused by the cryptic error message "failed to complete PATH because dir.getFileBlocks() is null and pendingFile is null" (I wonder why!) The most common cause of this seems to be that another user deleted the file (or its containing directory) while the writer was in progress. We should at least improve the error message on the NN side. Even better would be to expose the error message through the IOException passed over the RPC boundary to the client. Including a message like "(another client may have removed the file or its containing directory)" should do the trick. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J Chouraria updated HDFS-1615: Attachment: HDFS-1615.r1.diff Patch that makes a closed input stream throw an IOE instead, and a new test case for DFSInputStream (I could not find any existing one, so let me know if there is indeed one that covers general open/read/close stuff - I will gladly update the patch.) seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch, HDFS-1615.r1.diff After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031969#comment-13031969 ] Harsh J Chouraria commented on HDFS-1615: - Ooh! Darn I did not notice it got re-assigned, sorry 'bout that. Carry on Scott! seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch, HDFS-1615.r1.diff After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031975#comment-13031975 ] Harsh J Chouraria commented on HDFS-1615: - Ack, Scott, can you resubmit your file? I believe mine would end up conflicting the Hudson build. I'm unable to figure out how to delete it. Sorry again for not noticing the reassign man! seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031977#comment-13031977 ] Harsh J Chouraria commented on HDFS-1615: - Ah ok, I deleted my patch. My patch was this anyway, for reference: https://github.com/QwertyManiac/hadoop-hdfs/commit/fc7ac93ed5428651b63c8a6315cf6815303b4222 seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031988#comment-13031988 ] Scott Carey commented on HDFS-1615: --- No problem -- JIRA race condition. Your error message string is better than mine. I could not find any more specific testing around this either, but put the new test in an already existing class. seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031990#comment-13031990 ] Hadoop QA commented on HDFS-1621: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478853/hdfs-1621.txt against trunk revision 1102005. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/482//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/482//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/482//console This message is automatically generated. Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031991#comment-13031991 ] Todd Lipcon commented on HDFS-1621: --- oops, sorry, guess I wasn't clear in the description of this issue. It should be hadoop-common-${hadoop-common.version} -- not just ${hadoop-common.version}. Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1621: -- Status: Open (was: Patch Available) Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031992#comment-13031992 ] Scott Carey commented on HDFS-1615: --- Actually, I think it was my fault -- I thought it was unassigned when I assigned it to me. Sorry! seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Attachment: HDFS-1332.patch throwNotEnoughReplicasExceptionWithReason() composes the string containing flattened (data node, reason string) pairs. It returns DatanodeDescriptor so that the following method compiles: {code} private DatanodeDescriptor chooseRandom( String nodes, HashMap<Node, Node> excludedNodes, long blocksize, int maxNodesPerRack, List<DatanodeDescriptor> results) {code} isGoodTarget() now returns a reason string if the data node isn't a good target. Otherwise null is returned. When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log "Not able to place enough replicas" we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
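A sketch of the reporting side of this approach - collect (node, reason) pairs as isGoodTarget() rejects nodes, then flatten them into the warning. Names are illustrative only; see the attached patch for the real change:
{code}
import java.util.LinkedHashMap;
import java.util.Map;

class ExclusionReasons {
  private final Map<String, String> reasons =
      new LinkedHashMap<String, String>();

  // Called wherever isGoodTarget() returns a non-null reason string.
  void exclude(String node, String reason) {
    reasons.put(node, reason);
  }

  // Flattened (data node, reason) pairs for the log message, e.g.
  // "/rack1/dn3: not enough space remaining".
  String summary() {
    StringBuilder sb = new StringBuilder(
        "Not able to place enough replicas; nodes were excluded because:");
    for (Map.Entry<String, String> e : reasons.entrySet()) {
      sb.append("\n  ").append(e.getKey()).append(": ").append(e.getValue());
    }
    return sb.toString();
  }
}
{code}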
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032031#comment-13032031 ] Hadoop QA commented on HDFS-1615: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478861/HDFS-1615.patch against trunk revision 1102005. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/484//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/484//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/484//console This message is automatically generated. seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1903) Fix path display for rm/rmr
[ https://issues.apache.org/jira/browse/HDFS-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-1903: -- Attachment: HDFS-1903.patch update tests to match patch to common Fix path display for rm/rmr --- Key: HDFS-1903 URL: https://issues.apache.org/jira/browse/HDFS-1903 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1903.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032038#comment-13032038 ] Hadoop QA commented on HDFS-1615: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478862/HDFS-1615.r1.diff against trunk revision 1102005. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/483//testReport/ Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/483//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/483//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/483//console This message is automatically generated. seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1903) Fix path display for rm/rmr
[ https://issues.apache.org/jira/browse/HDFS-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-1903: -- Status: Patch Available (was: Open) Fix path display for rm/rmr --- Key: HDFS-1903 URL: https://issues.apache.org/jira/browse/HDFS-1903 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1903.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1918) DataXceiver double logs every IOE out of readBlock
[ https://issues.apache.org/jira/browse/HDFS-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032041#comment-13032041 ] Jean-Daniel Cryans commented on HDFS-1918: -- Oh, looking at trunk, it's going to be printed a 3rd time in BlockSender.sendChunks:
{code}
LOG.error("BlockSender.sendChunks() exception: " + StringUtils.stringifyException(e));
{code}
DataXceiver double logs every IOE out of readBlock -- Key: HDFS-1918 URL: https://issues.apache.org/jira/browse/HDFS-1918 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.20.2 Reporter: Jean-Daniel Cryans Priority: Trivial Fix For: 0.22.0 DataXceiver will log an IOE twice because opReadBlock() will catch it, log a WARN, then throw it again only to be caught in run() as a Throwable and logged as an ERROR. As far as I can tell all the information is the same in both messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
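To make the pattern concrete, here is a minimal, self-contained illustration of the log-and-rethrow double logging being discussed; the class and messages are hypothetical, not the actual DataXceiver source:
{code}
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class DoubleLogDemo {
  private static final Log LOG = LogFactory.getLog(DoubleLogDemo.class);

  // Mirrors opReadBlock(): catch the IOE, log a WARN, then rethrow it.
  static void opReadBlock() throws IOException {
    try {
      throw new IOException("disk error while serving block");
    } catch (IOException ioe) {
      LOG.warn("opReadBlock: " + ioe);  // first log line
      throw ioe;                        // rethrown...
    }
  }

  // Mirrors run(): the same exception is caught again and logged as an ERROR.
  public static void main(String[] args) {
    try {
      opReadBlock();
    } catch (Throwable t) {
      LOG.error("DataXceiver: " + t);   // ...and logged a second time here
    }
  }
}
{code}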
[jira] [Resolved] (HDFS-1022) Merge under-10-min tests specs into one file
[ https://issues.apache.org/jira/browse/HDFS-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved HDFS-1022. --- Resolution: Fixed Fix Version/s: 0.20.203.0 Resolving as fixed. This jira doesn't apply post project split. Merge under-10-min tests specs into one file Key: HDFS-1022 URL: https://issues.apache.org/jira/browse/HDFS-1022 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.1 Reporter: Erik Steffl Assignee: Erik Steffl Fix For: 0.20.203.0, 0.20.1 Attachments: jira.HDFS-1022.branch-0.20.1xx.patch The test-commit build target invokes macro-test-runner three times with three different files. This is a problem because macro-test-runner deletes logs before each run. The proposed solution is to merge all tests (common, hdfs, mapred) into one file, since it doesn't seem to be possible to call macro-test-runner with three files as arguments (or to change macro-test-runner to make it possible). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jolly Chen updated HDFS-1621: - Attachment: hdfs-1621-2.txt Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621-2.txt, hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1615) seek() on closed DFS input stream throws NPE
[ https://issues.apache.org/jira/browse/HDFS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1615: -- Resolution: Fixed Fix Version/s: 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1. Committed Scott's patch to trunk and branch-0.22. Thanks Scott and Harsh! seek() on closed DFS input stream throws NPE Key: HDFS-1615 URL: https://issues.apache.org/jira/browse/HDFS-1615 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Scott Carey Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1615.patch After closing an input stream on DFS, seeking slightly ahead of the last read will throw an NPE: java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:749) at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:42) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HDFS-1013) Miscellaneous improvements to HTML markup for web UIs
[ https://issues.apache.org/jira/browse/HDFS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-1013 started by Eugene Koontz. Miscellaneous improvements to HTML markup for web UIs - Key: HDFS-1013 URL: https://issues.apache.org/jira/browse/HDFS-1013 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Assignee: Eugene Koontz Priority: Minor Labels: newbie The Web UIs have various bits of bad markup (eg missing head sections, some pages missing CSS links, inconsistent td vs th for table headings). We should fix this up. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jolly Chen updated HDFS-1621: - Status: Patch Available (was: Open) Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621-2.txt, hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1378: -- Attachment: hdfs-1378.2.txt When looking at the patch before commit I noticed one small problem -- if the error occurs in the first 4 opcodes, it wouldn't print the offsets as part of the error message. Attached patch fixes this. Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
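The mechanism behind these patches is simple to sketch: while replaying, remember the byte offsets of the last few opcodes and include them in the failure message. A hypothetical helper along those lines (names invented here, not the actual patch code):
{code}
import java.util.ArrayDeque;
import java.util.Deque;

class RecentOpcodeOffsets {
  private static final int KEEP = 4;  // remember the last few opcodes
  private final Deque<Long> offsets = new ArrayDeque<Long>();

  // Called with the stream position before each opcode is decoded.
  void record(long fileOffset) {
    if (offsets.size() == KEEP) {
      offsets.removeFirst();
    }
    offsets.addLast(fileOffset);
  }

  // Appended to the replay error message so an operator can locate the bad
  // opcode and write an OP_INVALID marker at the right byte offset.
  String describe() {
    return "recent opcode offsets: " + offsets;
  }
}
{code}
Note Todd's point above: the tracker has to report something sensible even when fewer than four opcodes have been read, which is exactly the case the follow-up patch fixes.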
[jira] [Commented] (HDFS-1464) Fix reporting of 2NN address when dfs.secondary.http.address is default (wildcard)
[ https://issues.apache.org/jira/browse/HDFS-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032095#comment-13032095 ] Eli Collins commented on HDFS-1464: --- Hey Jakob, Would a patch that makes the 2NN not start with 0.0.0.0 if security is on be acceptable to you? Thanks, Eli Fix reporting of 2NN address when dfs.secondary.http.address is default (wildcard) -- Key: HDFS-1464 URL: https://issues.apache.org/jira/browse/HDFS-1464 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1464.txt HDFS-1080 broke the way that the 2NN identifies its own hostname to the NN during checkpoint upload. It used to use the local hostname, which as HDFS-1080 pointed out was error prone if it had multiple interfaces, etc. But now, with the default setting of dfs.secondary.http.address, the 2NN reports 0.0.0.0, which won't work either. We should look for the wildcard bind address and use the local hostname in that case, like we used to. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
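The fix being discussed would roughly amount to detecting the wildcard and substituting the local hostname, as the pre-HDFS-1080 code effectively did; a hedged sketch (class and method names are hypothetical):
{code}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

class CheckpointAddress {
  // If the configured 2NN http address is the wildcard (0.0.0.0), report the
  // local hostname to the NN instead of the unroutable wildcard address.
  static String reportableHost(InetSocketAddress configured) throws UnknownHostException {
    if (configured.getAddress() != null && configured.getAddress().isAnyLocalAddress()) {
      return InetAddress.getLocalHost().getCanonicalHostName();
    }
    return configured.getHostName();
  }
}
{code}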
[jira] [Updated] (HDFS-1897) Documentation refers to removed option dfs.network.script
[ https://issues.apache.org/jira/browse/HDFS-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Whang updated HDFS-1897: --- Attachment: HDFS-1897.patch cluster_setup.xml looks correct. Attached is the hdfs_user_guide change. Documentation refers to removed option dfs.network.script --- Key: HDFS-1897 URL: https://issues.apache.org/jira/browse/HDFS-1897 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Ari Rabkin Priority: Minor Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1897.patch The HDFS user guide tells users to use dfs.network.script for rack awareness. In fact, this option has been removed and using it will trigger a fatal error on DataNode startup. Documentation should describe the current rack awareness configuration system. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Attachment: (was: HDFS-1332.patch) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Attachment: HDFS-1332.patch Added null pointer check for nodeDescToReasonMap When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
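The shape of the change is easy to sketch: record a reason whenever a node is rejected, and guard against the map never having been populated (the NPE that Ted's null check addresses). A simplified illustration, not the attached patch:
{code}
import java.util.HashMap;
import java.util.Map;

class ExclusionReasons {
  // Maps a datanode's description to the reason it was excluded, so the
  // "Not able to place enough replicas" warning can say why each node was refused.
  private Map<String, String> nodeDescToReasonMap;  // may be null if nothing was excluded

  void exclude(String nodeDesc, String reason) {
    if (nodeDescToReasonMap == null) {
      nodeDescToReasonMap = new HashMap<String, String>();
    }
    nodeDescToReasonMap.put(nodeDesc,
        "Node " + nodeDesc + " is not chosen because " + reason);
  }

  // Null-safe summary for the warning message.
  String summary() {
    return nodeDescToReasonMap == null ? "" : nodeDescToReasonMap.values().toString();
  }
}
{code}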
[jira] [Commented] (HDFS-1903) Fix path display for rm/rmr
[ https://issues.apache.org/jira/browse/HDFS-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032103#comment-13032103 ] Hadoop QA commented on HDFS-1903: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478878/HDFS-1903.patch against trunk revision 1102005. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 21 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/485//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/485//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/485//console This message is automatically generated. Fix path display for rm/rmr --- Key: HDFS-1903 URL: https://issues.apache.org/jira/browse/HDFS-1903 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1903.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1918) DataXceiver double logs every IOE out of readBlock
[ https://issues.apache.org/jira/browse/HDFS-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HDFS-1918: - Attachment: HDFS-1918.patch It took me some time to understand what was going on; it looks like there's a lot of cruft. Here's what this patch does:
- BlockSender doesn't convert pure IOEs to SocketExceptions, because HDFS-1527 showed that doing so may hide other issues (which is why more logging of the exceptions was added).
- Because of that, there's no need to catch the IOEs at all, so I refactored the try/catches out.
- DataXceiver now cares about all the IOEs coming out. Like it says in the comments, it used to shut down, so my guess (which could be confirmed by digging in years of SVN commits) is that the SocketException trickery was made so that it wouldn't shut the DN down for normal reasons. Now that the code doesn't do that anymore, there's no reason to hack the IOEs into SocketExceptions (which was bad to begin with). I also added a comment to the others to describe the current state.
- opReadBlock doesn't throw the IOE anymore, so that run() doesn't print it too. I don't see any way this could change the behavior of the class.
I didn't add any unit test since no behavior was changed. DataXceiver double logs every IOE out of readBlock -- Key: HDFS-1918 URL: https://issues.apache.org/jira/browse/HDFS-1918 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.20.2 Reporter: Jean-Daniel Cryans Priority: Trivial Fix For: 0.22.0 Attachments: HDFS-1918.patch DataXceiver will log an IOE twice because opReadBlock() will catch it, log a WARN, then throw it again only to be caught in run() as a Throwable and logged as an ERROR. As far as I can tell all the information is the same in both messages. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1919) Upgrade to federated namespace fails
Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Blocker Fix For: 0.23.0 I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
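The misbehavior follows from how layout versions gate the upgrade path: layout versions are negative and decrease as the format evolves, so "pre-federation" is a numeric comparison. A small sketch (the gating method is hypothetical; the constants are the ones quoted in the report):
{code}
public class LayoutCheck {
  static final int LAST_PRE_FEDERATION_LAYOUT_VERSION = -30;  // trunk value, per this report
  static final int IMAGE_LAYOUT_VERSION = -33;                // written by 0.22

  // Pre-federation images carry no clusterID in their VERSION file.
  static boolean isPreFederation(int storedLayoutVersion) {
    return storedLayoutVersion >= LAST_PRE_FEDERATION_LAYOUT_VERSION;
  }

  public static void main(String[] args) {
    // -33 >= -30 is false, so a 0.22 image is misclassified as federated
    // and the NN demands a clusterID the image cannot have.
    System.out.println(isPreFederation(IMAGE_LAYOUT_VERSION));  // prints: false
  }
}
{code}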
[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032110#comment-13032110 ] Ted Yu commented on HDFS-1332: -- When I ran TestDecommission.testDecommission(), I saw the contents of nodeDescToReasonMap:
{code}
{127.0.0.1:49443=Node /default-rack/127.0.0.1:49443 is not chosen because the node is (being) decommissioned}
{code}
in:
{code}
BlockPlacementPolicyDefault.chooseRandom(int, String, HashMap<Node,Node>, long, int, List<DatanodeDescriptor>) line: 386
{code}
When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1919: -- Attachment: hdfs-1919.txt Patch to fix LAST_PRE_FEDERATION_LAYOUT_VERSION Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1919.txt I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1919: -- Status: Patch Available (was: Open) Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1919.txt I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1919: - Assignee: Todd Lipcon Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1919.txt I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032113#comment-13032113 ] Todd Lipcon commented on HDFS-1919: --- I patched too soon... this allows the upgrade and initial startup. But, then, on next start, the NN fails with: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has no block pool Id. Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1919.txt I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1903) Fix path display for rm/rmr
[ https://issues.apache.org/jira/browse/HDFS-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032116#comment-13032116 ] Daryn Sharp commented on HDFS-1903: --- These test changes correspond to changes in HADOOP-7267. The TestHDFSCLI and TestDFSShell errors occurred because the build started before the hadoop bug was committed. The flux in these tests should wind down in a few days. Fix path display for rm/rmr --- Key: HDFS-1903 URL: https://issues.apache.org/jira/browse/HDFS-1903 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1903.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1579) Browsing a block on the wrong datanode through web UI has unintuitive results
[ https://issues.apache.org/jira/browse/HDFS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1579: - Assignee: Omer Trajman Browsing a block on the wrong datanode through web UI has unintuitive results - Key: HDFS-1579 URL: https://issues.apache.org/jira/browse/HDFS-1579 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Omer Trajman Labels: newbie If you navigate through the web UI to view a file, and then the replica you're looking at is removed, the "Got error in response to OP_READ_BLOCK" message is printed in the content of the textarea that usually displays the text. This is confusing to users. Instead we should display the error message in a different format and direct the user to choose a different replica from the list of links at the bottom of the page. Or, alternatively, do an HTTP redirect to a different replica. We should also take care of the case where the generation stamp has a mismatch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032118#comment-13032118 ] John George commented on HDFS-1869: --- +1. Looks good. mkdirs should use the supplied permission for all of the created directories Key: HDFS-1869 URL: https://issues.apache.org/jira/browse/HDFS-1869 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-1869.patch Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions, even if inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow POSIX semantics. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
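A repro sketch of the reported behavior (the paths are illustrative, and the filesystem is assumed to point at HDFS):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MkdirsPermDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Ask for 0700 on the whole chain /a/b/c.
    fs.mkdirs(new Path("/a/b/c"), new FsPermission((short) 0700));
    // Buggy behavior: only "c" gets 0700; "a" and "b" inherit the parent
    // dir's permissions. The fix applies the supplied mode to each created dir.
    System.out.println(fs.getFileStatus(new Path("/a")).getPermission());
    System.out.println(fs.getFileStatus(new Path("/a/b/c")).getPermission());
  }
}
{code}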
[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032123#comment-13032123 ] Todd Lipcon commented on HDFS-1332: --- I think there's at least one more case worth adding explanation to. In testing this, I tried creating a file when there were no datanodes running:
{noformat}
java.io.IOException: File /user/todd/x2 could only be replicated to 0 nodes, instead of 1
{noformat}
This is another case that new users run into a lot. It would be great to change the code in FSNamesystem which throws this exception to check the total number of available nodes in the system, and if it's less than minReplication, include that in the message. Perhaps something like: "File /foo/bar could only be replicated to 0 nodes, instead of 1, because there are 0 datanodes running." Also, I tried testing this code by putting a file with a very large blocksize and my DN configured on a small partition. I got the following NPE:
{noformat}
java.io.IOException: java.lang.NullPointerException
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.throwNotEnoughReplicasExceptionWithReason(BlockPlacementPolicyDefault.java:324)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:367)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:262)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseLocalNode(BlockPlacementPolicyDefault.java:236)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:171)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:139)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:89)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:78)
  at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:234)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1637)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:841)
{noformat}
When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1897) Documentation refers to removed option dfs.network.script
[ https://issues.apache.org/jira/browse/HDFS-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1897: - Assignee: Andrew Whang Documentation refers to removed option dfs.network.script --- Key: HDFS-1897 URL: https://issues.apache.org/jira/browse/HDFS-1897 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Ari Rabkin Assignee: Andrew Whang Priority: Minor Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1897.patch The HDFS user guide tells users to use dfs.network.script for rack awareness. In fact, this option has been removed and using it will trigger a fatal error on DataNode startup. Documentation should describe the current rack awareness configuration system. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-1897) Documentation refers to removed option dfs.network.script
[ https://issues.apache.org/jira/browse/HDFS-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1897. --- Resolution: Fixed Hadoop Flags: [Reviewed] Committed to trunk and 0.22, thanks Andrew! Documentation refers to removed option dfs.network.script --- Key: HDFS-1897 URL: https://issues.apache.org/jira/browse/HDFS-1897 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Ari Rabkin Assignee: Andrew Whang Priority: Minor Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1897.patch The HDFS user guide tells users to use dfs.network.script for rack awareness. In fact, this option has been removed and using it will trigger a fatal error on DataNode startup. Documentation should describe the current rack awareness configuration system. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1920) libhdfs does not build for ARM processors
libhdfs does not build for ARM processors - Key: HDFS-1920 URL: https://issues.apache.org/jira/browse/HDFS-1920 Project: Hadoop HDFS Issue Type: Bug Components: contrib/libhdfs Affects Versions: 0.21.0 Environment: $ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/arm-linux-gnueabi/gcc/arm-linux-gnueabi/4.5.2/lto-wrapper Target: arm-linux-gnueabi Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-8ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=arm-linux-gnueabi --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib/arm-linux-gnueabi --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib/arm-linux-gnueabi --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --build=arm-linux-gnueabi --host=arm-linux-gnueabi --target=arm-linux-gnueabi Thread model: posix gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) $ uname -a Linux panda0 2.6.38-1002-linaro-omap #3-Ubuntu SMP Fri Apr 15 14:00:54 UTC 2011 armv7l armv7l armv7l GNU/Linux Reporter: Trevor Robinson $ ant compile -Dcompile.native=true -Dcompile.c++=1 -Dlibhdfs=1 -Dfusedfs=1 ... create-libhdfs-configure: ... [exec] configure: error: Unsupported CPU architecture armv7l Once the CPU arch check is fixed in src/c++/libhdfs/m4/apsupport.m4, then next issue is -m32: $ ant compile -Dcompile.native=true -Dcompile.c++=1 -Dlibhdfs=1 -Dfusedfs=1 ... compile-c++-libhdfs: [exec] /bin/bash ./libtool --tag=CC --mode=compile gcc -DPACKAGE_NAME=\libhdfs\ -DPACKAGE_TARNAME=\libhdfs\ -DPACKAGE_VERSION=\0.1.0\ -DPACKAGE_STRING=\libhdfs\ 0.1.0\ -DPACKAGE_BUGREPORT=\omal...@apache.org\ -DPACKAGE_URL=\\ -DPACKAGE=\libhdfs\ -DVERSION=\0.1.0\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\.libs/\ -Dsize_t=unsigned\ int -Dconst=/\*\*/ -Dvolatile=/\*\*/ -I. -I/home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\arm\ -m32 -I/usr/lib/jvm/java-6-openjdk/include -I/usr/lib/jvm/java-6-openjdk/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c -o hdfs.lo /home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs/hdfs.c [exec] make: Warning: File `.deps/hdfs_write.Po' has modification time 2.1 s in the future [exec] libtool: compile: gcc -DPACKAGE_NAME=\libhdfs\ -DPACKAGE_TARNAME=\libhdfs\ -DPACKAGE_VERSION=\0.1.0\ -DPACKAGE_STRING=\libhdfs 0.1.0\ -DPACKAGE_BUGREPORT=\omal...@apache.org\ -DPACKAGE_URL=\\ -DPACKAGE=\libhdfs\ -DVERSION=\0.1.0\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\.libs/\ -Dsize_t=unsigned int -Dconst=/**/ -Dvolatile=/**/ -I. 
-I/home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\arm\ -m32 -I/usr/lib/jvm/java-6-openjdk/include -I/usr/lib/jvm/java-6-openjdk/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c /home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs/hdfs.c -fPIC -DPIC -o .libs/hdfs.o [exec] cc1: error: unrecognized command line option -m32 [exec] make: *** [hdfs.lo] Error 1 Here, gcc does not support -m32 for the ARM target, so -m${JVM_ARCH} must be omitted from CFLAGS and LDFLAGS. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1920) libhdfs does not build for ARM processors
[ https://issues.apache.org/jira/browse/HDFS-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Robinson updated HDFS-1920: -- Attachment: hadoop-hdfs-arm.patch This patch adds a case for ARM to src/c++/libhdfs/m4/apsupport.m4 and omits -m${JVM_ARCH} from CFLAGS and LDFLAGS if $host_cpu starts with arm. libhdfs does not build for ARM processors - Key: HDFS-1920 URL: https://issues.apache.org/jira/browse/HDFS-1920 Project: Hadoop HDFS Issue Type: Bug Components: contrib/libhdfs Affects Versions: 0.21.0 Environment: $ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/arm-linux-gnueabi/gcc/arm-linux-gnueabi/4.5.2/lto-wrapper Target: arm-linux-gnueabi Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-8ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=arm-linux-gnueabi --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib/arm-linux-gnueabi --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib/arm-linux-gnueabi --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --build=arm-linux-gnueabi --host=arm-linux-gnueabi --target=arm-linux-gnueabi Thread model: posix gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) $ uname -a Linux panda0 2.6.38-1002-linaro-omap #3-Ubuntu SMP Fri Apr 15 14:00:54 UTC 2011 armv7l armv7l armv7l GNU/Linux Reporter: Trevor Robinson Attachments: hadoop-hdfs-arm.patch $ ant compile -Dcompile.native=true -Dcompile.c++=1 -Dlibhdfs=1 -Dfusedfs=1 ... create-libhdfs-configure: ... [exec] configure: error: Unsupported CPU architecture armv7l Once the CPU arch check is fixed in src/c++/libhdfs/m4/apsupport.m4, then next issue is -m32: $ ant compile -Dcompile.native=true -Dcompile.c++=1 -Dlibhdfs=1 -Dfusedfs=1 ... compile-c++-libhdfs: [exec] /bin/bash ./libtool --tag=CC --mode=compile gcc -DPACKAGE_NAME=\libhdfs\ -DPACKAGE_TARNAME=\libhdfs\ -DPACKAGE_VERSION=\0.1.0\ -DPACKAGE_STRING=\libhdfs\ 0.1.0\ -DPACKAGE_BUGREPORT=\omal...@apache.org\ -DPACKAGE_URL=\\ -DPACKAGE=\libhdfs\ -DVERSION=\0.1.0\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\.libs/\ -Dsize_t=unsigned\ int -Dconst=/\*\*/ -Dvolatile=/\*\*/ -I. 
-I/home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\arm\ -m32 -I/usr/lib/jvm/java-6-openjdk/include -I/usr/lib/jvm/java-6-openjdk/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c -o hdfs.lo /home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs/hdfs.c [exec] make: Warning: File `.deps/hdfs_write.Po' has modification time 2.1 s in the future [exec] libtool: compile: gcc -DPACKAGE_NAME=\libhdfs\ -DPACKAGE_TARNAME=\libhdfs\ -DPACKAGE_VERSION=\0.1.0\ -DPACKAGE_STRING=\libhdfs 0.1.0\ -DPACKAGE_BUGREPORT=\omal...@apache.org\ -DPACKAGE_URL=\\ -DPACKAGE=\libhdfs\ -DVERSION=\0.1.0\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\.libs/\ -Dsize_t=unsigned int -Dconst=/**/ -Dvolatile=/**/ -I. -I/home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\arm\ -m32 -I/usr/lib/jvm/java-6-openjdk/include -I/usr/lib/jvm/java-6-openjdk/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c /home/trobinson/dev/hadoop-hdfs/src/c++/libhdfs/hdfs.c -fPIC -DPIC -o .libs/hdfs.o [exec] cc1: error: unrecognized command line option -m32 [exec] make: *** [hdfs.lo] Error 1 Here, gcc does not support -m32 for the ARM target, so -m${JVM_ARCH} must be omitted from CFLAGS and LDFLAGS. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032156#comment-13032156 ] Hadoop QA commented on HDFS-1621: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478883/hdfs-1621-2.txt against trunk revision 1102094. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/486//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/486//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/486//console This message is automatically generated. Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621-2.txt, hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032157#comment-13032157 ] Ted Yu commented on HDFS-1332: -- I added a new test in TestDecommission:
{code}
@Test
public void testDecommissionForReasonableExceptionMsg() throws IOException {
  startCluster(1, 2, conf);
  FileSystem fileSys = cluster.getFileSystem(0);
  // Decommission one node. Verify that node is decommissioned.
  DatanodeInfo decomNode = decommissionNode(0, null, AdminStates.DECOMMISSIONED);
  Path file1 = new Path("testDecommission.dat");
  try {
    writeFile(fileSys, file1, 2);
  } catch (Exception nere) {
    String msg = nere.getMessage();
    assertTrue("Expected detail about not enough replicas",
        msg.contains("is not chosen"));
  }
}
{code}
But the NotEnoughReplicasException is caught by chooseTarget(). At least I saw the following:
{code}
11/05/11 15:54:58 WARN namenode.FSNamesystem: Not able to place enough replicas, still in need of 1 to reach 2
Not able to place enough replicas.[127.0.0.1:50055: Node /default-rack/127.0.0.1:50055 is not chosen because the node is (being) decommissioned ]
{code}
When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032161#comment-13032161 ] Hadoop QA commented on HDFS-1378: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478890/hdfs-1378.2.txt against trunk revision 1102094. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/487//console This message is automatically generated. Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032162#comment-13032162 ] Todd Lipcon commented on HDFS-1332: --- I'm referring to this code in FSNamesystem:
{code}
DatanodeDescriptor targets[] = blockManager.replicator.chooseTarget(
    src, replication, clientNode, excludedNodes, blockSize);
if (targets.length < blockManager.minReplication) {
  throw new IOException("File " + src + " could only be replicated to "
      + targets.length + " nodes, instead of "
      + blockManager.minReplication);
}
{code}
When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
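Todd's earlier suggestion would extend the check quoted above along these lines; a hedged sketch (the clusterMap wiring is assumed rather than taken from a posted patch, though NetworkTopology does expose getNumOfLeaves()):
{code}
if (targets.length < blockManager.minReplication) {
  throw new IOException("File " + src + " could only be replicated to "
      + targets.length + " nodes, instead of " + blockManager.minReplication
      + ", because there are only " + clusterMap.getNumOfLeaves()
      + " datanode(s) running");
}
{code}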
[jira] [Created] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
Save namespace can cause NN to be unable to come up on restart -- Key: HDFS-1921 URL: https://issues.apache.org/jira/browse/HDFS-1921 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Priority: Critical Fix For: 0.22.0, 0.23.0 I discovered this in the course of trying to implement a fix for HDFS-1505. Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace proceeds in the following order:
# rename current to lastcheckpoint.tmp for all of them,
# save image and recreate edits for all of them,
# rename lastcheckpoint.tmp to previous.checkpoint.
The problem is that step 3 occurs regardless of whether or not an error occurs for all storage directories in step 2. Upon restart, the NN will see non-existent or corrupt {{current}} directories, and no {{lastcheckpoint.tmp}} directories, and so will conclude that the storage directories are not formatted. This issue appears to be present on both 0.22 and 0.23. This should arguably be a 0.22/0.23 blocker. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
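The direction of a fix falls out of the description: step 3 must be made conditional on step 2 having succeeded for that directory, so a failed directory keeps its lastcheckpoint.tmp for recovery. A sketch with hypothetical types (not a posted patch):
{code}
import java.util.List;

class SaveNamespaceSketch {
  // Minimal stand-ins for illustration only.
  interface StorageDirectory {
    boolean saveSucceeded();           // did step 2 complete for this dir?
    void promoteLastCheckpoint();      // lastcheckpoint.tmp -> previous.checkpoint
    String root();
  }

  static void finishCheckpoint(List<StorageDirectory> dirs) {
    for (StorageDirectory sd : dirs) {
      if (sd.saveSucceeded()) {
        sd.promoteLastCheckpoint();
      } else {
        // Keep lastcheckpoint.tmp so the NN can still come up after a restart.
        System.err.println("Keeping lastcheckpoint.tmp in " + sd.root()
            + ": image save failed");
      }
    }
  }
}
{code}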
[jira] [Commented] (HDFS-1919) Upgrade to federated namespace fails
[ https://issues.apache.org/jira/browse/HDFS-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032170#comment-13032170 ] Hadoop QA commented on HDFS-1919: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12478896/hdfs-1919.txt against trunk revision 1102094. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/488//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/488//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/488//console This message is automatically generated. Upgrade to federated namespace fails Key: HDFS-1919 URL: https://issues.apache.org/jira/browse/HDFS-1919 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1919.txt I formatted a namenode running off 0.22 branch, and trying to start it on trunk yields: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/name1 is in an inconsistent state: file VERSION has clusterID mising. It looks like 0.22 has LAYOUT_VERSION -33, but trunk has LAST_PRE_FEDERATION_LAYOUT_VERSION = -30, which is incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1505: - Attachment: hdfs-1505-22.0.patch Patch for 0.22. Note that I commented out a section of Todd's original test since it won't pass until HDFS-1921 is fixed. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-test.txt After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Attachment: (was: HDFS-1332.patch) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Attachment: HDFS-1332.patch Enhanced exception message for getAdditionalBlock() as suggested by Todd. When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
[ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-1332: - Fix Version/s: 0.23.0 Status: Patch Available (was: Open) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded -- Key: HDFS-1332 URL: https://issues.apache.org/jira/browse/HDFS-1332 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Todd Lipcon Assignee: Ted Yu Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1332.patch Whenever the block placement policy determines that a node is not a good target it could add the reason for exclusion to a list, and then when we log Not able to place enough replicas we could say why each node was refused. This would help new users who are having issues on pseudo-distributed (eg because their data dir is on /tmp and /tmp is full). Right now it's very difficult to figure out the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1505: - Attachment: hdfs-1505-trunk.0.patch Patch for trunk. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032184#comment-13032184 ] Aaron T. Myers commented on HDFS-1505: -- Jakob, do you mind if I reassign this JIRA to myself? saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
[ https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1921: -- Priority: Blocker (was: Critical) I agree this is a blocker. Save namespace can cause NN to be unable to come up on restart -- Key: HDFS-1921 URL: https://issues.apache.org/jira/browse/HDFS-1921 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Priority: Blocker Fix For: 0.22.0, 0.23.0 I discovered this in the course of trying to implement a fix for HDFS-1505. Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace proceeds in the following order:
# rename current to lastcheckpoint.tmp for all of them,
# save image and recreate edits for all of them,
# rename lastcheckpoint.tmp to previous.checkpoint.
The problem is that step 3 occurs regardless of whether step 2 failed for some or even all storage directories. Upon restart, the NN will see non-existent or corrupt {{current}} directories, and no {{lastcheckpoint.tmp}} directories, and so will conclude that the storage directories are not formatted. This issue appears to be present on both 0.22 and 0.23. This should arguably be a 0.22/0.23 blocker. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
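A sketch of the fix direction implied by the description above (all method names are hypothetical, not the actual FSImage code): step 3 must be gated on step 2 succeeding for each directory, so that a failed directory keeps its lastcheckpoint.tmp for recovery on restart.
{noformat}
// Illustrative only -- not the actual FSImage.saveNamespace() code.
import java.io.IOException;
import java.util.List;

abstract class SaveNamespaceSketch {
  abstract void renameCurrentToLastCheckpointTmp(String dir) throws IOException;
  abstract void saveImageAndRecreateEdits(String dir) throws IOException;
  abstract void renameLastCheckpointTmpToPrevious(String dir) throws IOException;
  abstract void reportError(String dir, IOException ioe);

  void saveNamespace(List<String> storageDirs) {
    for (String dir : storageDirs) {
      try {
        renameCurrentToLastCheckpointTmp(dir); // step 1
        saveImageAndRecreateEdits(dir);        // step 2
        // Step 3 only runs if step 2 succeeded for this directory;
        // promoting lastcheckpoint.tmp after a failed save would leave
        // the NN with no current/ and no lastcheckpoint.tmp on restart.
        renameLastCheckpointTmpToPrevious(dir);
      } catch (IOException ioe) {
        reportError(dir, ioe); // keep lastcheckpoint.tmp for recovery
      }
    }
  }
}
{noformat}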
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032186#comment-13032186 ] Todd Lipcon commented on HDFS-1505: ---
{noformat}
+    if (storage.getNumStorageDirs(NameNodeDirType.IMAGE) == 0 &&
+        storage.getNumStorageDirs(NameNodeDirType.EDITS) == 0 &&
+        storage.getNumStorageDirs(NameNodeDirType.IMAGE_AND_EDITS) == 0) {
+      throw new IOException("Failed to save any storage directories while saving namespace");
{noformat}
I think the check for .EDITS is incorrect -- if we lose all the IMAGE and IMAGE_AND_EDITS dirs, but still have EDITS dirs, we're still screwed. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
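Following Todd's comment, a corrected guard would treat the loss of all image-bearing directories as fatal even when EDITS-only directories survive. A hedged sketch of that shape (not the committed patch):
{noformat}
// Sketch of the corrected check; surviving EDITS-only dirs don't help,
// since there is no fsimage for their edits to be applied against.
if (storage.getNumStorageDirs(NameNodeDirType.IMAGE) == 0 &&
    storage.getNumStorageDirs(NameNodeDirType.IMAGE_AND_EDITS) == 0) {
  throw new IOException(
      "Failed to save the namespace image to any storage directory");
}
{noformat}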
[jira] [Updated] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1505: - Attachment: hdfs-1505-trunk.1.patch hdfs-1505-22.1.patch Thanks a lot for the review, Todd. I've attached updated patches incorporating your comments. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032192#comment-13032192 ] Aaron T. Myers commented on HDFS-1378: -- Updated patch looks good to me. Thanks for catching that, Todd. Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
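For readers following along: the offset-tracking approach amounts to wrapping the edit log input in a byte-counting stream so the loader can report where replay died. A minimal, hypothetical sketch (the patch's actual class and method names may differ):
{noformat}
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Counts bytes consumed so a replay error can report a file offset,
// making it easier to find where to write the OP_INVALID marker.
class PositionTrackingInputStream extends FilterInputStream {
  private long pos = 0;

  PositionTrackingInputStream(InputStream in) { super(in); }

  @Override
  public int read() throws IOException {
    int b = in.read();
    if (b != -1) pos++;
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int n = in.read(buf, off, len);
    if (n > 0) pos += n;
    return n;
  }

  long getPos() { return pos; } // offset of the opcode being replayed
}
{noformat}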
[jira] [Commented] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
[ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032194#comment-13032194 ] Jason Rutherglen commented on HDFS-941: --- I'm seeing many errors trying to apply http://issues.apache.org/jira/secure/attachment/12476027/HDFS-941-6.patch to https://svn.apache.org/repos/asf/hadoop/hdfs/trunk Datanode xceiver protocol should allow reuse of a connection Key: HDFS-941 URL: https://issues.apache.org/jira/browse/HDFS-941 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: bc Wong Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch, hdfs941-1.png Right now each connection into the datanode xceiver only processes one operation. In the case that an operation leaves the stream in a well-defined state (eg a client reads to the end of a block successfully) the same connection could be reused for a second operation. This should improve random read performance significantly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
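As background on what the patch changes: conceptually the client keeps a small cache of sockets to datanodes and reuses one when the previous operation left the stream in a well-defined state. A simplified, hypothetical sketch (the real patch's caching details differ):
{noformat}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.HashMap;
import java.util.Map;

// Simplified sketch: cache at most one idle socket per datanode.
class XceiverSocketCache {
  private final Map<InetSocketAddress, Socket> idle =
      new HashMap<InetSocketAddress, Socket>();

  synchronized Socket get(InetSocketAddress dn) {
    Socket s = idle.remove(dn);
    return (s != null && s.isConnected() && !s.isClosed()) ? s : null;
  }

  // Only sockets left in a well-defined state (e.g. a client read to the
  // end of a block successfully) should be offered back for reuse.
  synchronized void put(InetSocketAddress dn, Socket s) {
    Socket prev = idle.put(dn, s);
    if (prev != null) {
      try { prev.close(); } catch (IOException ignored) {}
    }
  }
}
{noformat}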
[jira] [Updated] (HDFS-1378) Edit log replay should track and report file offsets in case of errors
[ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1378: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks Aaron! Edit log replay should track and report file offsets in case of errors -- Key: HDFS-1378 URL: https://issues.apache.org/jira/browse/HDFS-1378 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch, hdfs-1378.1.patch, hdfs-1378.2.txt Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1505: -- Status: Patch Available (was: Open) saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1505: - Assignee: Aaron T. Myers (was: Jakob Homan) saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
Recurring failure in TestJMXGet.testNameNode since build 477 on May 11 -- Key: HDFS-1922 URL: https://issues.apache.org/jira/browse/HDFS-1922 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Matt Foley -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated
[ https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi updated HDFS-1592: - Attachment: HDFS-1592-1.patch Attaching the patch for the 0.23 version. Datanode startup doesn't honor volumes.tolerated - Key: HDFS-1592 URL: https://issues.apache.org/jira/browse/HDFS-1592 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.204.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.20.204.0, 0.23.0 Attachments: HDFS-1592-1.patch, HDFS-1592-rel20.patch Datanode startup doesn't honor volumes.tolerated for the Hadoop 20 version. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
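For context, the intended semantics of the setting: the datanode should start as long as the number of failed volumes does not exceed dfs.datanode.failed.volumes.tolerated. A hedged sketch of the startup check (helper shape hypothetical, not the patch itself):
{noformat}
import java.io.IOException;

// Illustrative startup check; not the actual DataNode code.
class VolumeToleranceCheck {
  // 'tolerated' comes from dfs.datanode.failed.volumes.tolerated.
  static void check(int configuredVolumes, int failedVolumes, int tolerated)
      throws IOException {
    if (failedVolumes > tolerated) {
      throw new IOException("Too many failed volumes: " + failedVolumes
          + " failed, only " + tolerated + " tolerated");
    }
    if (configuredVolumes - failedVolumes < 1) {
      throw new IOException("No usable volumes remaining");
    }
  }
}
{noformat}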
[jira] [Created] (HDFS-1923) Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29
Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29 Key: HDFS-1923 URL: https://issues.apache.org/jira/browse/HDFS-1923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Matt Foley -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1923) Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29
[ https://issues.apache.org/jira/browse/HDFS-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032209#comment-13032209 ] Matt Foley commented on HDFS-1923: -- Test case TestFiDataTransferProtocol2.pipeline_Fi_29() has failed in builds 409, 425, 455, 460, 464, 469, 481, 483, 484. Failure mode varies. Intermittent recurring failure in TestFiDataTransferProtocol2.pipeline_Fi_29 Key: HDFS-1923 URL: https://issues.apache.org/jira/browse/HDFS-1923 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Matt Foley -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1117) HDFS portion of HADOOP-6728 (overhaul metrics framework)
[ https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032211#comment-13032211 ] Tanping Wang commented on HDFS-1117: Just a quick note that TestJMXGet has been failing on HDFS trunk since yesterday; this patch should fix that failure. HDFS portion of HADOOP-6728 (overhaul metrics framework) --- Key: HDFS-1117 URL: https://issues.apache.org/jira/browse/HDFS-1117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.20.2 Reporter: Luke Lu Assignee: Luke Lu Fix For: 0.23.0 Attachments: HDFS-1117.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
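For anyone reproducing the failure locally: what TestJMXGet exercises reduces to reading NameNode metrics attributes over JMX. A generic sketch using only the standard javax.management API; the ObjectName and attribute below are assumptions about what the overhauled framework registers, not verified names:
{noformat}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Generic in-process JMX read; the MBean/attribute names are assumed.
class JmxReadSketch {
  public static void main(String[] args) throws Exception {
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    // Hypothetical name -- depends on what the metrics overhaul registers.
    ObjectName name =
        new ObjectName("Hadoop:service=NameNode,name=NameNodeInfo");
    Object version = mbs.getAttribute(name, "Version");
    System.out.println("NameNode version via JMX: " + version);
  }
}
{noformat}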
[jira] [Commented] (HDFS-1621) Fix references to hadoop-common-${version} in build.xml
[ https://issues.apache.org/jira/browse/HDFS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032212#comment-13032212 ] Todd Lipcon commented on HDFS-1621: --- +1 Fix references to hadoop-common-${version} in build.xml --- Key: HDFS-1621 URL: https://issues.apache.org/jira/browse/HDFS-1621 Project: Hadoop HDFS Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon Labels: newbie Attachments: hdfs-1621-2.txt, hdfs-1621.txt Similar to MAPREDUCE-2315, we should fix any references to the hadoop common jar that use ${version} instead of ${hadoop-common.version}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
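The shape of the change (illustrative, not the actual patch hunk): jar references in build.xml that used the HDFS project's ${version} must switch to the dedicated ${hadoop-common.version} property, e.g.:
{noformat}
<!-- before: ${version} resolves to the HDFS project version -->
<include name="hadoop-common-${version}.jar"/>
<!-- after -->
<include name="hadoop-common-${hadoop-common.version}.jar"/>
{noformat}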