[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932971#comment-13932971 ] Akira AJISAKA commented on HDFS-5978: - [~wheat9], thank you for your comment. I updated the patch to reflect the comment, and added the {{FILESTATUS}} operation. bq. I'm also unclear why {{StringEncoder}} is required. It is required because the type of the JSON content is {{String}}. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5978: Attachment: HDFS-5978.2.patch Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933009#comment-13933009 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634324/HDFS-6097.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6389//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6389//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
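To illustrate the class of bug being fixed: the 2GB constraint applies to the *length* of each MappedByteBuffer segment, not to the absolute file offset, so the guard has to clamp the mapped length rather than reject reads whose starting offset exceeds 2GB. The following is a minimal standalone sketch of the intended check; it is not the actual {{DFSInputStream#tryReadZeroCopy}} code, and the class and method names are illustrative only.
{code}
// Minimal sketch of the intended guard, not the actual DFSInputStream code.
// A MappedByteBuffer can cover at most Integer.MAX_VALUE bytes, but the
// *offset* handed to FileChannel.map() may legitimately be far beyond 2GB.
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapOffsetSketch {
  static MappedByteBuffer mapSlice(FileChannel ch, long offset, long wanted)
      throws Exception {
    // Correct check: clamp the mapped *length* to 2GB; do not reject the
    // read just because the starting offset itself exceeds 2GB.
    long length = Math.min(wanted, Integer.MAX_VALUE);
    return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
  }

  public static void main(String[] args) throws Exception {
    // Assumes args[0] names a file larger than about 3GB.
    RandomAccessFile f = new RandomAccessFile(args[0], "r");
    try {
      long offset = 3L * 1024 * 1024 * 1024;   // 3GB into the file
      MappedByteBuffer buf = mapSlice(f.getChannel(), offset, 4096);
      System.out.println("mapped " + buf.remaining() + " bytes at offset " + offset);
    } finally {
      f.close();
    }
  }
}
{code}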
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933084#comment-13933084 ] Hadoop QA commented on HDFS-5978: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634387/HDFS-5978.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//console This message is automatically generated. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933107#comment-13933107 ] Hudson commented on HDFS-6096: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933105#comment-13933105 ] Hudson commented on HDFS-6079: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
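As background on what "enforcing the timeout" means for a fan-out call like this, one common pattern is to submit the per-datanode queries to an executor and bound the whole batch with {{ExecutorService#invokeAll}}, which cancels anything still running when the deadline expires. This is only a generic sketch under that assumption, not necessarily the mechanism used in the committed patch; the {{queryDatanode}} helper is hypothetical.
{code}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StorageLocationTimeoutSketch {
  // Hypothetical stand-in for a per-datanode RPC.
  static Callable<String> queryDatanode(final String dn) {
    return new Callable<String>() {
      public String call() throws Exception { return dn + ": ok"; }
    };
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Callable<String>> calls =
        Arrays.asList(queryDatanode("dn1"), queryDatanode("dn2"));
    // invokeAll enforces the deadline: tasks still running when the timeout
    // expires are cancelled, and their futures report cancellation.
    List<Future<String>> results = pool.invokeAll(calls, 1500, TimeUnit.MILLISECONDS);
    for (Future<String> f : results) {
      if (!f.isCancelled()) {
        System.out.println(f.get());
      }
    }
    pool.shutdown();
  }
}
{code}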
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933110#comment-13933110 ] Hudson commented on HDFS-5705: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
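The failure mode itself is plain java.util behavior: iterating a {{HashMap}} while another thread mutates it fails fast. The snippet below is a self-contained illustration of that race; it is not HDFS code, the map contents are made up, and it typically (though not deterministically) ends with a {{ConcurrentModificationException}}.
{code}
// Standalone illustration of the race described above, not HDFS code.
import java.util.HashMap;
import java.util.Map;

public class CmeSketch {
  public static void main(String[] args) throws Exception {
    final Map<String, String> blockPools = new HashMap<String, String>();
    for (int i = 0; i < 100; i++) {
      blockPools.put("BP-" + i, "volume");
    }
    Thread adder = new Thread(new Runnable() {   // stands in for addBlockPool()
      public void run() {
        for (int i = 100; i < 200000; i++) {
          blockPools.put("BP-" + i, "volume");
        }
      }
    });
    adder.start();
    // Stands in for shutdown(): iterating while the other thread mutates the
    // map usually fails fast with ConcurrentModificationException.
    for (Map.Entry<String, String> e : blockPools.entrySet()) {
      Thread.sleep(1);   // widen the race window
      e.getKey();
    }
    adder.join();
  }
}
{code}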
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933243#comment-13933243 ] Hudson commented on HDFS-6079: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933248#comment-13933248 ] Hudson commented on HDFS-5705: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933245#comment-13933245 ] Hudson commented on HDFS-6096: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933344#comment-13933344 ] Hudson commented on HDFS-6079: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933346#comment-13933346 ] Hudson commented on HDFS-6096: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933349#comment-13933349 ] Hudson commented on HDFS-5705: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation
[ https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933475#comment-13933475 ] Thanh Do commented on HDFS-6009: Hi Yu Li, I want to follow up on this issue. Could you please elaborate on the datanode failure? In particular, what caused the failure in your case? Was it a disk error, a network failure, or a buggy application? If it was a disk or network failure, I think isolation using datanode groups is reasonable. Tools based on favored node feature for isolation - Key: HDFS-6009 URL: https://issues.apache.org/jira/browse/HDFS-6009 Project: Hadoop HDFS Issue Type: Task Affects Versions: 2.3.0 Reporter: Yu Li Assignee: Yu Li Priority: Minor There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where in multi-tenant deployments of HBase we prefer to specify several groups of regionservers to serve different applications, to achieve some kind of isolation or resource allocation. However, although the regionservers are grouped, the datanodes which store the data are not, which leads to the case that one datanode failure affects multiple applications, as we have already observed in our production environment. To relieve the above issue, we could make use of the favored node feature (HDFS-2576) to let a regionserver locate data within its group, in effect grouping the datanodes (passively) to form some level of isolation. In this case, or any other case that needs datanodes to be grouped, we would need a set of tools to maintain the groups, including: 1. Making the balancer able to balance data among specified servers, rather than the whole set 2. Setting the balance bandwidth for specified servers, rather than the whole set 3. A tool to check whether a block is placed across groups, and to move it back if so This JIRA is an umbrella for the above tools. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933570#comment-13933570 ] Chris Nauroth commented on HDFS-6097: - The patch looks good, Colin. Just a few small things: # {{DFSInputStream#tryReadZeroCopy}}: It seems unnecessary to copy {{pos}} to {{curPos}}. The value of {{curPos}} is never changed throughout the method, so it's always the same as {{pos}}. This is a synchronized method, so I don't expect {{pos}} to get mutated on a different thread. # {{TestEnhancedByteBufferAccess}}: Let's remove the commented out lines and the extra indentation on the {{Assert.fail}} line. Let's use try-finally blocks to guarantee cleanup of {{cluster}}, {{fs}}, {{fsIn}} and {{fsIn2}}. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
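For reference, the cleanup structure being requested looks roughly like the following. This is only a sketch of the shape, not the committed test; {{TEST_PATH}} and the test body are placeholders.
{code}
// Sketch of the requested try-finally cleanup; not the committed test code.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Test;

public class TestZeroCopyCleanupSketch {
  private static final Path TEST_PATH = new Path("/test/file");  // placeholder

  @Test
  public void testWithGuaranteedCleanup() throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = null;
    FileSystem fs = null;
    FSDataInputStream fsIn = null;
    try {
      cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
      cluster.waitActive();
      fs = cluster.getFileSystem();
      fs.create(TEST_PATH).close();   // make sure the file exists
      fsIn = fs.open(TEST_PATH);
      // ... read through fsIn and assert ...
    } finally {
      // Runs whether the assertions pass or fail, so a failed test cannot
      // leave the cluster (and its locks on the test directory) behind.
      if (fsIn != null) { fsIn.close(); }
      if (fs != null) { fs.close(); }
      if (cluster != null) { cluster.shutdown(); }
    }
  }
}
{code}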
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5244: -- Summary: TestNNStorageRetentionManager#testPurgeMultipleDirs fails (was: TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order. ) TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
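The underlying Java behavior is easy to demonstrate in isolation: a {{HashMap}} iterates in an unspecified order while a {{LinkedHashSet}} preserves insertion order, so any test that implicitly assumes the two agree is fragile. A small standalone sketch (not the test code itself; the directory names are made up):
{code}
// Standalone demonstration of the ordering mismatch; not the test code.
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class IterationOrderSketch {
  public static void main(String[] args) {
    Map<String, Integer> dirRoots = new HashMap<String, Integer>();
    Set<String> purgeOrder = new LinkedHashSet<String>();
    for (String dir : new String[] {"/nn/name1", "/nn/name2", "/nn/name3", "/nn/name4"}) {
      dirRoots.put(dir, 1);   // iteration order is unspecified
      purgeOrder.add(dir);    // iteration order == insertion order
    }
    // The two orders are not guaranteed to agree, which is what the test
    // was implicitly assuming.
    System.out.println("HashMap order:       " + dirRoots.keySet());
    System.out.println("LinkedHashSet order: " + purgeOrder);
  }
}
{code}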
[jira] [Commented] (HDFS-6038) JournalNode hardcodes NameNodeLayoutVersion in the edit log file
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933577#comment-13933577 ] Jing Zhao commented on HDFS-6038: - Thanks for the review, Todd! bq. just worried that other contributors may want to review this patch as it's actually making an edit log format change, not just a protocol change for the JNs. I will update the jira title and description to make them more clear about the changes. bq. it might be nice to add a QJM test which writes fake ops to a JournalNode Yeah, will update the patch to add the unit test. JournalNode hardcodes NameNodeLayoutVersion in the edit log file Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5244: -- Resolution: Fixed Target Version/s: 2.4.0 (was: 3.0.0, 2.1.1-beta) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed the patch to trunk, branch-2 and branch-2.4. Thank you [~jwang302]! TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.4.0, 2.1.0-beta Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933594#comment-13933594 ] Hudson commented on HDFS-5244: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5321 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5321/]) HDFS-5244. TestNNStorageRetentionManager#testPurgeMultipleDirs fails. Contributed by Jinghui Wang. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577254) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933590#comment-13933590 ] Brandon Li commented on HDFS-6080: -- +1 Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1MB, 100MB, and 1GB files. We noticed a significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size (64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 |
| testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 |
| testFile1Mb | 1048576 | 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 |
Fuse reads:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 |
| testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 |
| testFile1Mb | 1048576 | 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 |
NFS reads after rtmax = 1MB:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 |
| testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 |
| testFile1Mb | 1048576 | 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 |
-- This message was sent by Atlassian JIRA (v6.2#6252)
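The change under review replaces the hard-coded limits with values read from the configuration. The sketch below only illustrates that idea; the property names and the 1MB default shown here are assumptions for the example, not necessarily the keys introduced by the patch.
{code}
// Illustration only: read the transfer-size limits from the configuration
// instead of hard-coding them in RpcProgramNFS3. The property names and the
// 1MB default are assumptions for this example.
import org.apache.hadoop.conf.Configuration;

public class Nfs3TransferSizeSketch {
  static final String RTMAX_KEY = "dfs.nfs3.rtmax";     // hypothetical key
  static final String WTMAX_KEY = "dfs.nfs3.wtmax";     // hypothetical key
  static final int DEFAULT_TRANSFER_MAX = 1024 * 1024;  // 1MB

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int rtmax = conf.getInt(RTMAX_KEY, DEFAULT_TRANSFER_MAX);
    int wtmax = conf.getInt(WTMAX_KEY, DEFAULT_TRANSFER_MAX);
    System.out.println("rtmax=" + rtmax + ", wtmax=" + wtmax);
  }
}
{code}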
[jira] [Updated] (HDFS-6038) JournalNode hardcodes NameNodeLayoutVersion in the edit log file
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Description: In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. was: In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. JournalNode hardcodes NameNodeLayoutVersion in the edit log file Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
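The second change, persisting a length field per op so that the last txid can be found without decoding op bodies, can be illustrated with a generic length-prefixed framing. This is a sketch under an assumed record layout (4-byte length, 8-byte txid, opaque body); it is not the actual FSEditLogOp encoding or the new scanEditLog implementation.
{code}
// Sketch of skip-without-decoding over a length-prefixed record stream.
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class ScanSegmentSketch {
  // Assumed record layout: 4-byte length, 8-byte txid, then (length - 8)
  // opaque bytes. Returns the last complete txid without decoding op bodies.
  static long scanLastTxid(DataInputStream in) throws IOException {
    long lastTxid = -1;
    while (true) {
      int length;
      try {
        length = in.readInt();
      } catch (EOFException eof) {
        return lastTxid;               // clean end of segment
      }
      if (length < 8) {
        return lastTxid;               // corrupt or truncated record
      }
      long txid = in.readLong();
      if (in.skipBytes(length - 8) < length - 8) {
        return lastTxid;               // truncated trailing op; ignore it
      }
      lastTxid = txid;
    }
  }

  public static void main(String[] args) throws IOException {
    DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
    try {
      System.out.println("last txid: " + scanLastTxid(in));
    } finally {
      in.close();
    }
  }
}
{code}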
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Summary: Allow JournalNode to handle editlog produced by new release with future layoutversion (was: JournalNode hardcodes NameNodeLayoutVersion in the edit log file) Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
[ https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated HDFS-6092: --- Attachment: HDFS-6092-v4.patch DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port -- Key: HDFS-6092 URL: https://issues.apache.org/jira/browse/HDFS-6092 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Attachments: HDFS-6092-v4.patch, haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt I discovered this when working on HBASE-10717. Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);
String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
The canonical name string contains the default port (8020), but the uri doesn't contain a port. This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils) Time elapsed: 0.001 sec ERROR!
java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
Thanks to Brandon Li, who helped debug this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
[ https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933627#comment-13933627 ] haosdent commented on HDFS-6092: [~te...@apache.org] I have uploaded HDFS-6092-v4.patch. Could you help me review it? :-) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port -- Key: HDFS-6092 URL: https://issues.apache.org/jira/browse/HDFS-6092 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Attachments: HDFS-6092-v4.patch, haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt I discovered this when working on HBASE-10717. Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);
String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
The canonical name string contains the default port (8020), but the uri doesn't contain a port. This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils) Time elapsed: 0.001 sec ERROR!
java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
Thanks to Brandon Li, who helped debug this. -- This message was sent by Atlassian JIRA (v6.2#6252)
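Until the inconsistency is resolved, a defensive caller can fill in a default port when the URI omits one before building the socket address. A minimal sketch; the 8020 default is hard-coded here purely for illustration.
{code}
// Defensive normalization on the caller side; 8020 is used here only as an
// illustrative default NameNode RPC port.
import java.net.InetSocketAddress;
import java.net.URI;

public class NnAddressSketch {
  static InetSocketAddress toSocketAddress(URI fsUri) {
    int port = fsUri.getPort();
    if (port == -1) {           // a URI like hdfs://127.0.0.1/ carries no port
      port = 8020;
    }
    return new InetSocketAddress(fsUri.getHost(), port);
  }

  public static void main(String[] args) {
    System.out.println(toSocketAddress(URI.create("hdfs://127.0.0.1/")));
  }
}
{code}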
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933643#comment-13933643 ] Colin Patrick McCabe commented on HDFS-6097: bq. DFSInputStream#tryReadZeroCopy: It seems unnecessary to copy pos to curPos. The value of curPos is never changed throughout the method, so it's always the same as pos. This is a synchronized method, so I don't expect pos to get mutated on a different thread. This is actually an optimization I made. I wanted to avoid making a memory access each time, since I use this variable a lot. By copying it to a local variable, it becomes a lot more obvious to the optimizer that it can't change. It's possible that Java will perform this optimization automatically, but I'm skeptical because we're calling a lot of functions here. It seems like it would require a sophisticated optimizer to realize that there was no code path that changed this variable. bq. TestEnhancedByteBufferAccess: Let's remove the commented out lines and the extra indentation on the Assert.fail line. OK. bq. Let's use try-finally blocks to guarantee cleanup of cluster, fs, fsIn and fsIn2. I guess I've started to skip doing this on unit tests. My rationale is that if the test fails, cleanup isn't really that important (the surefire process will simply terminate). In the meantime, try... finally blocks complicate the code and often make it hard to see where a test originally failed. Oftentimes if things get messed up, {{FileSystem#close}} or {{MiniDFSCluster#shutdown}} will throw an exception. And you end up seeing this unhelpful exception rather than the root cause of the problem displayed in the maven test output. On the other hand, I suppose going without try... finally could encourage people to copy flawed code, so I guess that's the counter-argument. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933664#comment-13933664 ] Haohui Mai commented on HDFS-5978: -- bq. I updated the patch to reflect the comment, and added FILESTATUS operation. I would appreciate it if you could separate it into a new jira. It looks to me that the new patch has not fully addressed the previous round of comments (yet). bq. It is required because the type of the json content is String. I'm yet to be convinced. The code is dumping a UTF-8 string directly into the channel buffer. I don't quite follow why you need an extra pipeline stage to dump the string into the channel buffer. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
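For context on the comment about dumping the string straight into the channel buffer: under the Netty 3 API (assumed here, since that is what this code base uses at the time), encoding the JSON reply to UTF-8 and attaching it to the HTTP response needs no dedicated {{StringEncoder}} pipeline stage. A sketch only, with the handler wiring omitted:
{code}
// Sketch: encode the JSON reply to UTF-8 bytes and attach it to the HTTP
// response directly, with no StringEncoder in the pipeline. Netty 3 API
// assumed; not taken from the patch under review.
import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.buffer.ChannelBuffers;
import org.jboss.netty.handler.codec.http.DefaultHttpResponse;
import org.jboss.netty.handler.codec.http.HttpResponseStatus;
import org.jboss.netty.handler.codec.http.HttpVersion;
import org.jboss.netty.util.CharsetUtil;

public class JsonResponseSketch {
  static DefaultHttpResponse jsonResponse(String json) {
    ChannelBuffer content = ChannelBuffers.copiedBuffer(json, CharsetUtil.UTF_8);
    DefaultHttpResponse response =
        new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
    response.setHeader("Content-Type", "application/json");
    response.setContent(content);
    return response;
  }
}
{code}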
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933668#comment-13933668 ] Colin Patrick McCabe commented on HDFS-6097: Thanks for the review, Chris. I'm going to put out a new version in a sec with the test cleanups, and with try... finally in the test. I guess I'll bring up the try... finally issue on the mailing list at some point, and see what people think. In the meantime, I'd like to get this in soon so we can continue testing... zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.004.patch zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: (was: HDFS-6097.004.patch) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Open (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Patch Available (was: Open) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.004.patch zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933818#comment-13933818 ] Chris Nauroth commented on HDFS-6097: - bq. This is actually an optimization I made. I see. Thanks for explaining. Would you mind putting a comment in there? bq. I guess I've started to skip doing this on unit tests. I got into the try-finally habit during the Windows work. On Windows, we'd have one test fail and leave the cluster running, because it wasn't doing shutdown. Then, subsequent tests would also fail during initialization due to the more pessimistic file locking behavior on Windows. The prior cluster still held locks on the test data directory, so the subsequent tests couldn't reformat. The subsequent tests would have passed otherwise, so this had the effect of disrupting full test run reports with a lot of false failures. It made it more difficult to determine exactly which test was really failing. If the stack traces from close aren't helpful, then we can stifle them by calling {{IOUtils#cleanup}} and passing a null logger. FWIW, my current favorite way to do this is cluster initialization in a {{BeforeClass}} method, cluster shutdown in an {{AfterClass}} method, and sometimes close of individual streams or file systems in an {{After}} method depending on what the test is doing. This reins in the code clutter of try-finally. It's not always convenient though if you need to change {{Configuration}} in each test or if you need per-test isolation for some other reason. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
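A sketch of the class-level lifecycle pattern described above, assuming a single shared {{Configuration}} is acceptable for every test in the class; it is illustrative only, not taken from an existing test.
{code}
// Illustrative class-level lifecycle; not taken from an existing test.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.io.IOUtils;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class TestClusterLifecycleSketch {
  private static MiniDFSCluster cluster;
  private static FileSystem fs;

  @BeforeClass
  public static void setUpCluster() throws Exception {
    cluster = new MiniDFSCluster.Builder(new Configuration())
        .numDataNodes(1).build();
    cluster.waitActive();
    fs = cluster.getFileSystem();
  }

  @AfterClass
  public static void tearDownCluster() {
    // Passing a null logger stifles secondary close() exceptions so a real
    // test failure isn't drowned out in the report.
    IOUtils.cleanup(null, fs);
    if (cluster != null) {
      cluster.shutdown();
    }
  }

  @Test
  public void testSomethingAgainstTheCluster() throws Exception {
    // ... per-test assertions against fs ...
  }
}
{code}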
[jira] [Created] (HDFS-6102) Cannot load an fsimage with a very large directory
Andrew Wang created HDFS-6102: - Summary: Cannot load an fsimage with a very large directory Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
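The limit being hit is protobuf's 64MB default per message, which {{CodedInputStream#setSizeLimit}} (named in the exception text) lets the caller raise. A generic sketch of raising the limit before parsing follows; it is not the FSImage loader itself.
{code}
// Generic sketch: raise protobuf's per-message size limit before parsing a
// message that may exceed the 64MB default (e.g. a huge directory section).
import com.google.protobuf.CodedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PbSizeLimitSketch {
  static CodedInputStream newLargeMessageStream(InputStream in) {
    CodedInputStream cis = CodedInputStream.newInstance(in);
    cis.setSizeLimit(Integer.MAX_VALUE);   // effectively unlimited
    return cis;
  }

  public static void main(String[] args) throws IOException {
    FileInputStream fin = new FileInputStream(args[0]);
    try {
      CodedInputStream cis = newLargeMessageStream(fin);
      System.out.println("size limit raised; first tag = " + cis.readTag());
    } finally {
      fin.close();
    }
  }
}
{code}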
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933822#comment-13933822 ] Chris Nauroth commented on HDFS-6097: - Thanks, Colin. I also meant to add that it's a bit less relevant in this patch, because we know this test won't run on Windows (at least not yet), but like you said it does set a precedent that someone could copy-paste into future tests. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933832#comment-13933832 ] Chris Nauroth commented on HDFS-6097: - Thanks for posting v4. Were you also going to put in the comment that copying {{pos}} to {{curPos}} is an optimization? +1 after that, pending Jenkins run. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-6102: - Assignee: Andrew Wang Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933856#comment-13933856 ] Andrew Wang commented on HDFS-6102: --- Doing some back of the envelope math while looking at INodeDirectorySection in fsimage.proto, we save a packed uint64 per child. These are varints, but let's assume worst case and they use the full 10 bytes. Thus, with the 64MB default max message size, we arrive at 6.7 million entries. There are a couple approaches here: - Split the directory section up into multiple messages, such that each message is under the limit - Up the default from 64MB to the maximum supported value of 512MB, release note, and assume no one will realistically hit this - Enforce a configurable maximum on the # of entries per directory I think #3 is the best solution here, under the assumption that no one will need 6 million things in a directory. Still needs to be release noted of course. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
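For reference, the worst-case arithmetic behind the 6.7 million figure: 64 MiB at 10 bytes per packed uint64 gives 67,108,864 / 10 ≈ 6.7 million directory entries before a single message exceeds the limit.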
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Patch Available (was: Open) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.005.patch * Add a comment about the curPos optimization * add a few more comments to {{tryReadZeroCopy}} zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Open (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933864#comment-13933864 ] Chris Nauroth commented on HDFS-6097: - +1 for v5 pending Jenkins. Thanks again, Colin. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933868#comment-13933868 ] Akira AJISAKA commented on HDFS-5978: - bq. I appreciate if you can separate it into a new jira. I'll separate it. bq. I don't quite follow why you need an extra pipeline stage to dump the string into the channel buffer. The pipeline stage is to encode a UTF-8 String into the channel buffer. It is required in order to dump the UTF-8 String directly into the channel buffer. Or do you mean a ChannelBuffer should be used instead of a String to create the response content? Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
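For reference, the stage being discussed would look roughly like the following sketch, assuming the Netty 3 API in use at the time; the handler name is illustrative and this is not the patch's actual code:
{code:java}
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.string.StringEncoder;
import org.jboss.netty.util.CharsetUtil;

public class ViewerPipelineSketch {
  static ChannelPipeline buildPipeline() {
    ChannelPipeline pipeline = Channels.pipeline();
    // The StringEncoder stage turns the String response (e.g. the JSON body)
    // into a ChannelBuffer as it is written out.
    pipeline.addLast("stringEncoder", new StringEncoder(CharsetUtil.UTF_8));
    return pipeline;
  }
}
{code}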
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933874#comment-13933874 ] Colin Patrick McCabe commented on HDFS-6097: bq. \[try-finally\] That's a good point, I guess. I had been assuming that the cleanup wasn't really required after a test failure, but that might not be a good assumption. In particular, we'd like to know if the subsequent tests succeeded or failed... bq. FWIW, my current favorite way to do this is cluster initialization in a BeforeClass method, cluster shutdown in an AfterClass method, and sometimes close of individual streams or file systems in an After method depending on what the test is doing. This reigns in the code clutter of try-finally. It's not always convenient though if you need to change Configuration in each test or if you need per-test isolation for some other reason. It does feel natural to use the Before method, but it also can be inflexible, like you mentioned. I think on balance I usually prefer creating a common function or class that I can have several test functions share. But it does require a try... finally and some extra boilerplate. I wish there were a way to make Before methods apply to only some test methods, or at least modify the configuration they use. bq. Thanks for posting v4. Were you also going to put in the comment that copying pos to curPos is an optimization? +1 after that, pending Jenkins run. added, thanks zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933876#comment-13933876 ] Andrew Wang commented on HDFS-6102: --- I also did a quick audit of the rest of fsimage.proto, and I think the other repeated fields are okay. INodeFile has a repeated BlockProto of up to size 30B, but we already have a default max # of blocks per file limit of 1 million so this should be okay (30MB < 64MB). Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5978: Attachment: HDFS-5978.3.patch Separated {{GETFILESTATUS}} support and fixed findbug warnings. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miodrag Radulovic updated HDFS-5516: Attachment: HDFS-5516.patch I have added two basic tests for simple auth configuration. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require a user name to be specified in the request URL even when the core-site configuration sets HTTP authentication to simple and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
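For reference, the configuration under test corresponds roughly to the following sketch (real property names; the value choices and helper method are illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class SimpleAuthConfSketch {
  // Simple HTTP authentication with anonymous requests disallowed; in this
  // mode WebHDFS requests are expected to carry user.name in the URL.
  static Configuration simpleAuthConf() {
    Configuration conf = new Configuration();
    conf.set("hadoop.http.authentication.type", "simple");
    conf.setBoolean("hadoop.http.authentication.simple.anonymous.allowed", false);
    return conf;
  }
}
{code}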
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933985#comment-13933985 ] Haohui Mai commented on HDFS-6102: -- It might be sufficient to put it into the release note. I agree with you that realistically it is quite unlikely to see someone put 6.7m inodes as the direct children of a single directory. I'm a little hesitant to introduce a new configuration just for this reason. I wonder, does the namespace quota offer a superset of this functionality? It might be more natural to enforce this in the scope of the namespace quota. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-6007: --- Attachment: HDFS-6007-4.patch Attaching an updated patch. - removed the section about ZCR, - added the description about permission on legacy SCR, - removed the table of configurations, - added configurations to hdfs-default.xml. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch updating the contents of HDFS Short-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
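For reference, the basic client-side settings that this documentation covers look roughly like the following sketch (real property names; the socket path and helper method are illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitConfSketch {
  static Configuration shortCircuitConf() {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // The domain socket path must match the path configured on the DataNode.
    conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
    return conf;
  }
}
{code}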
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934026#comment-13934026 ] Suresh Srinivas commented on HDFS-6102: --- bq. Enforce a configurable maximum on the # of entries per directory I think this is reasonable. Recently we changed the default max length of file name allowed. We should also add reasonable limit to the number of entries in a directory. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934031#comment-13934031 ] Travis Thompson commented on HDFS-6084: --- Yeah, having external links on intranet sites is usually looked down upon. Or maybe force it to open in a new tab? Either way, I don't think the page logo should link back to Apache, I click it constantly expecting to go back to the main Namenode page and remembering that's not where it goes. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Attachments: HDFS-6084.1.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934058#comment-13934058 ] Haohui Mai commented on HDFS-6084: -- Let's just remove all the links but leave the text. It seems to me that this solution leads to minimal confusion. There are three external links in the current web UI. Two of them are in {{dfshealth.html}}, and one of them is in {{explorer.html}}. [~tthompso], can you please submit a new patch that removes all external links? Thanks. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Attachments: HDFS-6084.1.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934062#comment-13934062 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634503/HDFS-6097.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6393//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6393//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934066#comment-13934066 ] Brandon Li commented on HDFS-6080: -- There are some format issues with the doc change. I've fixed them when committing the patch. Thank you, Abin, for the contribution! Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6080: - Attachment: HDFS-6080.patch Uploaded the committed patch. Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Fix For: 2.4.0 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6080: - Fix Version/s: 2.4.0 Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Fix For: 2.4.0 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934091#comment-13934091 ] Andrew Wang commented on HDFS-6102: --- Thanks for the comments, Haohui and Suresh. I think this is actually easier than I thought, since there's already a config parameter to limit directory size (dfs.namenode.fs-limits.max-directory-items). If we just change the default to 1024*1024 or something, that might be enough. I'm currently reading through the code to make sure it works and doing manual testing, will post a (likely trivial) patch soon. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
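For reference, the proposal amounts to something like the following on the configuration side (the key already exists; the value shown is only the suggested default):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DirectoryItemsLimitSketch {
  static Configuration limitDirectoryItems() {
    Configuration conf = new Configuration();
    // Cap children per directory well below the ~6.7 million entries that
    // would overflow a 64MB protobuf message in the fsimage.
    conf.setInt("dfs.namenode.fs-limits.max-directory-items", 1024 * 1024);
    return conf;
  }
}
{code}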
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934129#comment-13934129 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634509/HDFS-6097.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6394//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6394//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Attachment: HDFS-6038.008.patch Update the patch to address Todd's comments. The main change is to add a new unit test in TestJournal. In the new test we write some editlog that JNs cannot decode, and verify that the JN can utilize the length field to scan the segment. Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
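To illustrate the scanning idea in the description, the following is a generic sketch of skipping length-prefixed records without decoding them; it is not the actual JournalNode or {{FSEditLogOp}} code, and the record format is assumed purely for illustration:
{code:java}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class LengthScanSketch {
  // Count complete length-prefixed records, jumping over each payload
  // instead of decoding it.
  static long scanRecords(DataInputStream in) throws IOException {
    long count = 0;
    while (true) {
      final int length;
      try {
        length = in.readInt();     // persisted total length of the next record
      } catch (EOFException e) {
        break;                     // clean end of the segment
      }
      if (in.skipBytes(length) < length) {
        break;                     // truncated trailing record; stop scanning
      }
      count++;
    }
    return count;
  }
}
{code}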
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Status: Patch Available (was: Open) Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6084: -- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~tthompso] for the contribution. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934201#comment-13934201 ] Hadoop QA commented on HDFS-5978: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634515/HDFS-5978.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6395//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6395//console This message is automatically generated. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Attachment: HDFS-6099.1.patch The attached patch checks max path component length and max children during renames. I reworked {{TestFsLimits}} quite a bit to do real file system operations instead of directly accessing private {{FSDirectory}} methods. That helped me write the new rename tests, and it also ends up covering more of the real {{FSDirectory}} code. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Status: Patch Available (was: Open) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6099: -- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~cnauroth] for the contribution. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6099: -- Comment: was deleted (was: I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~cnauroth] for the contribution.) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Status: Patch Available (was: Reopened) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.4.0 Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas reopened HDFS-6084: --- Sorry for accidentally resolving this jira. Reopening it. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6084: -- Comment: was deleted (was: I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~tthompso] for the contribution.) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934255#comment-13934255 ] Haohui Mai commented on HDFS-6084: -- Looks good to me. +1 pending jenkins. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934254#comment-13934254 ] Akira AJISAKA commented on HDFS-5978: - The test failure is reported by HDFS-5997 and looks unrelated to the patch. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Attachment: HDFS-6099.2.patch I'm attaching patch v2 with one more small change. I added {{PathComponentTooLongException}} and {{MaxDirectoryItemsExceededException}} to the terse exceptions list. These are ultimately caused by bad client requests, so there isn't any value in writing the full stack trace to the NameNode logs. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
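For context, a "terse" exception is one that gets logged as a one-line message instead of a full stack trace. The following is a minimal, self-contained sketch of that idea only; it is not the actual NameNode RPC server code, and the class and exception names are assumptions for illustration.
{code}
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for the "terse exceptions" idea: exceptions caused by
// bad client input are logged without a stack trace.
public class TerseExceptionDemo {
  private static final Set<Class<? extends Exception>> TERSE = new HashSet<>();
  static {
    TERSE.add(IllegalArgumentException.class); // e.g. a client-supplied bad path
  }

  static void logException(Exception e) {
    if (TERSE.contains(e.getClass())) {
      // Terse: the message alone is enough; a stack trace adds no value.
      System.err.println(e.getClass().getSimpleName() + ": " + e.getMessage());
    } else {
      // Unexpected server-side error: keep the full stack trace for debugging.
      e.printStackTrace();
    }
  }

  public static void main(String[] args) {
    logException(new IllegalArgumentException("path component too long"));
    logException(new RuntimeException("unexpected internal error"));
  }
}
{code}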
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Attachment: hdfs-6102-1.patch Patch attached. It's dead simple, just ups the default in DFSConfigKeys and hdfs-default.xml, and adds some notes. I also took the opportunity to set the max component limit in DFSConfigKeys, since I noticed that HDFS-6055 didn't do that. I manually tested by adding a million dirs to a dir, and we hit the limit. NN was able to startup again afterwards, and the fsimage itself was only 78MB (most of that probably going to the INode names). I think this is best case, not worst case, since IIRC the inode numbers start low and count up, but if someone wants to verify my envelope math I think it's good to go. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
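For readers who want to sanity-check the envelope math above, a rough calculation against the default 64 MB protobuf message limit might look like the following. The ~10 bytes per child is an assumption for illustration, not the exact DirEntry wire format.
{code}
// Back-of-the-envelope estimate of how many children a single serialized
// directory entry can hold before hitting the default protobuf size limit.
// The per-child byte cost is an assumption, not the exact encoding.
public class DirEntrySizeEstimate {
  public static void main(String[] args) {
    long protobufLimitBytes = 64L * 1024 * 1024; // default CodedInputStream limit
    long bytesPerChild = 10;                     // assumed worst-case varint inode id
    long maxChildren = protobufLimitBytes / bytesPerChild;
    System.out.println("Approx. children per directory before hitting the limit: "
        + maxChildren); // on the order of 6-7 million
  }
}
{code}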
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Status: Patch Available (was: Open) Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934262#comment-13934262 ] Haohui Mai commented on HDFS-6102: -- It looks mostly good to me. The only comment I have is that the code should no longer support unlimited number of children in a directory, which is in {{FSDirectory#verifyMaxDirItems()}}. {code} if (maxDirItems == 0) { return; } {code} Otherwise users might run into a problem that the saved fsimage cannot be consumed. Do you think it is a good idea to enforce a maximum limit, say, 6.7m based on your calculation? Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
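A minimal sketch of the kind of bound check being suggested here follows; the constant values and message wording are assumptions for illustration, and the committed patch may differ.
{code}
import com.google.common.base.Preconditions;

// Sketch: validate the configured directory-items limit at startup so that a
// value large enough to produce an unloadable fsimage is rejected up front.
// MIN/MAX values below are illustrative assumptions, not the actual constants.
public class DirItemsLimitCheck {
  static final int MIN_DIR_ITEMS = 1;
  static final int MAX_DIR_ITEMS = 6400000; // assumed hard cap below the PB limit

  static int validateMaxDirItems(int configured) {
    Preconditions.checkArgument(
        configured >= MIN_DIR_ITEMS && configured <= MAX_DIR_ITEMS,
        "Cannot set dfs.namenode.fs-limits.max-directory-items to %s: "
            + "it must be between %s and %s", configured, MIN_DIR_ITEMS, MAX_DIR_ITEMS);
    return configured;
  }
}
{code}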
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5516: Target Version/s: 3.0.0, 1-win, 1.3.0, 2.4.0 (was: 3.0.0, 1-win, 1.3.0) Hadoop Flags: Reviewed Status: Patch Available (was: In Progress) +1 for the patch. Thanks for adding the tests. I'm clicking the Submit Patch button to give it a Jenkins test run. I see target versions were set to 1.x too. Do you want to attach a patch for branch-1? WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.2.0, 1.2.1, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Attachment: hdfs-6102-2.patch Good idea Haohui, new patch adds some precondition checks and removes that if statement. Also a new test for the preconditions. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934307#comment-13934307 ] Haohui Mai commented on HDFS-6102: -- +1 pending jenkins Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934314#comment-13934314 ] Hadoop QA commented on HDFS-6007: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634538/HDFS-6007-4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6396//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6396//console This message is automatically generated. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch updating the contents of HDFS SHort-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934316#comment-13934316 ] Konstantin Shvachko commented on HDFS-6087: --- Not sure I fully understood what you propose. So please feel free to correct if I am wrong. # Sounds like you propose to update blockID every time the pipeline fails and that will guarantee block immutability. Isn't that similar to how current HDFS uses generationStamp? When pipeline fails HDFS increments genStamp making previously created replicas outdated. # Seems you propose to introduce an extra commitBlock() call to NN. Current HDFS has similar logic. Block commit is incorporated with addBlock() and complete() calls. E.g. addBlock() changes state to committed of the previous block of the file and then allocates the new one. # Don't see how you get rid of lease recovery. The purpose of which is to reconcile different replicas of the incomplete last block, as they can have different lengths or genStamps on different DNs, as the results of the client or DNs failure in the middle of a data transfer. If you propose to discard uncommitted blocks entirely, then it will break current semantics, which states that if a byte was read by a client once it should be readable by other clients as well. # I guess it boils down to that your diagrams show regular work-flow, but don't consider failure scenarios. Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is HDFS client’s responsibility to re-added with new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Attachment: HDFS-6100.000.patch DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Summary: DataNodeWebHdfsMethods does not failover in HA mode (was: webhdfs filesystem does not failover in HA mode) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Status: Patch Available (was: Open) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Description: In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. (was: While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode.) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934349#comment-13934349 ] Hadoop QA commented on HDFS-6084: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634558/HDFS-6084.2.patch.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestCheckpoint {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6397//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6397//console This message is automatically generated. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934360#comment-13934360 ] Haohui Mai commented on HDFS-6100: -- The v0 patch overloads the meaning of the URL parameter {{namenoderpcaddress}}. It is the host-port pair of the NN in non-HA mode, but it becomes the nameservice id in HA mode. DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. -- This message was sent by Atlassian JIRA (v6.2#6252)
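To illustrate the distinction between a logical nameservice and a host-port pair from the client side, here is a small sketch of ordinary HDFS client code (not the DataNodeWebHdfsMethods fix itself); the nameservice name "mycluster" and the hostname are assumptions for illustration.
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch: a logical nameservice URI is resolved through the configured failover
// proxy provider and follows the active NN; a fixed host:port pins the client to
// one NameNode even if it is currently the standby.
public class NameserviceVsHostPort {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem haFs = FileSystem.get(URI.create("hdfs://mycluster/"), conf);
    FileSystem fixedFs = FileSystem.get(URI.create("hdfs://nn1.example.com:8020/"), conf);
    System.out.println(haFs.getUri() + " vs " + fixedFs.getUri());
  }
}
{code}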
[jira] [Commented] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934370#comment-13934370 ] Hadoop QA commented on HDFS-6038: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634559/HDFS-6038.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer The test build failed in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6398//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6398//console This message is automatically generated. Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
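The "scan by length instead of decoding" idea in the description can be illustrated with a simplified sketch. The record layout below (a 4-byte length prefix per op) is an assumption for illustration; the real editlog op encoding is more involved.
{code}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch: walk a length-prefixed segment without decoding op bodies, stopping
// cleanly at a truncated trailing op. Layout is illustrative, not the HDFS format.
public class LengthPrefixedScanner {
  public static long countOps(String path) throws IOException {
    long ops = 0;
    try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
      while (true) {
        int len;
        try {
          len = in.readInt();      // length of the next op
        } catch (EOFException eof) {
          break;                   // clean end of the segment
        }
        byte[] body = new byte[len];
        try {
          in.readFully(body);      // jump past the op body without decoding it
        } catch (EOFException truncated) {
          break;                   // partially written trailing op; stop scanning
        }
        ops++;
      }
    }
    return ops;
  }
}
{code}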
[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934376#comment-13934376 ] Hadoop QA commented on HDFS-5516: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634527/HDFS-5516.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6401//console This message is automatically generated. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934384#comment-13934384 ] Jing Zhao commented on HDFS-6094: - I can also reproduce the issue on my local machine. Looks like the issue is: 1. After the standby NN restarts, DN1 sends first the incremental block report then the complete block report to SBN. 2. DN2 sends the incremental block report to SBN. This block report will not change the replica number in SBN because the corresponding storage ID has not been added in SBN yet (the storage ID will only be added during the full block report processing). However, the SBN still checks the current live replica number (which is 1 because SBN already received the full block report from DN1) and uses that number to update the safe block count. So maybe a simple fix can be:
{code}
@@ -2277,7 +2277,7 @@ private Block addStoredBlock(final BlockInfo block,
     if (storedBlock.getBlockUCState() == BlockUCState.COMMITTED &&
         numLiveReplicas >= minReplication) {
       storedBlock = completeBlock(bc, storedBlock, false);
-    } else if (storedBlock.isComplete()) {
+    } else if (storedBlock.isComplete() && added) {
       // check whether safe replication is reached for the block
       // only complete blocks are counted towards that
       // Is no-op if not in safe mode.
{code}
The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-6084. -- Resolution: Fixed Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934388#comment-13934388 ] Haohui Mai commented on HDFS-6084: -- I've committed the patch to trunk, branch-2 and branch-2.4. Thanks [~tthompso] for the contribution. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934398#comment-13934398 ] Hudson commented on HDFS-6084: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5325 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5325/]) HDFS-6084. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage. Contributed by Travis Thompson. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577401) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934407#comment-13934407 ] Jing Zhao commented on HDFS-6094: - Another option is to add new storage id even for incremental block report. [~arpitagarwal], what do you think? The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6094: Attachment: TestHASafeMode-output.txt Attach the log of the test that reproduced the failure. I injected an exception for each increment of safe block count. The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6103) FSImage file system image version check throws a (slightly) wrong parameter.
jun aoki created HDFS-6103: -- Summary: FSImage file system image version check throws a (slightly) wrong parameter. Key: HDFS-6103 URL: https://issues.apache.org/jira/browse/HDFS-6103 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: jun aoki Priority: Trivial Trivial error message issue: When upgrading HDFS, say from 2.0.5 to 2.2.0, users will need to start the namenode with the upgrade option, e.g.
{code}
sudo service namenode upgrade
{code}
However, the actual error emitted when starting without the option says -upgrade (with a hyphen):
{code}
2014-03-13 23:38:15,488 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: File system image contains an old layout version -40. An upgrade to version -47 is required. Please restart NameNode with -upgrade option.
 at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:787)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:568)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:684)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:669)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
2014-03-13 23:38:15,492 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-13 23:38:15,493 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.2.202 / ~
{code}
I'm referring to 2.0.5 above, https://github.com/apache/hadoop-common/blob/branch-2.0.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L225 I haven't tried trunk, but it seems to return UPGRADE (all upper case), which is again a slightly wrong error description. https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L232 -- This message was sent by Atlassian JIRA (v6.2#6252)
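As a sketch of the kind of fix being suggested, keeping the user-facing flag string in a single place prevents the error message and the accepted command-line option from drifting apart. The enum below is a simplified stand-in for illustration, not the actual HdfsServerConstants startup-option code.
{code}
// Sketch: one enum owns the exact flag string, so the error message and the
// accepted command-line option can never disagree. Simplified illustration only.
public class StartupOptionDemo {
  enum StartupOption {
    UPGRADE("-upgrade"),
    ROLLBACK("-rollback");

    private final String flag;
    StartupOption(String flag) { this.flag = flag; }
    String getFlag() { return flag; }
  }

  public static void main(String[] args) {
    System.err.println("File system image contains an old layout version. "
        + "Please restart NameNode with the " + StartupOption.UPGRADE.getFlag()
        + " option.");
  }
}
{code}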
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934472#comment-13934472 ] Jing Zhao commented on HDFS-6094: - Maybe another issue with the current code is that when an incremental block report comes before the full block report, if the stored block state is COMMITTED, we may increase the safemode total block number while not increase the safe block count. In that case I'm not sure if the NN can get stuck in the safemode. The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934485#comment-13934485 ] Guo Ruijing commented on HDFS-6087: --- I plan to remove the snapshot part, add one work-flow for write/append/truncate, and add more work-flows for exception handling in the design proposal. The basic idea:
1) A block is immutable. If the block is committed to the NN, we can copy the block instead of appending to it, and then commit the copy to the NN.
2) Before a block is committed to the NN, it is the client's responsibility to re-add it if the write fails, and other clients cannot read that block, so we don't need a generationStamp to recover the block.
3) After a block is committed to the NN, the file length is updated in the NN, so clients cannot see uncommitted blocks.
4) write/append/truncate share the same logic.
1. Update the BlockID on any failure before commit, including pipeline failure. The design proposal tries to remove the generationStamp.
2. An extra copyBlock(oldBlockID, newBlockID, length) call is used for append and truncate.
3. commitBlock: a) the block becomes immutable; b) remove all blocks after the offset to implement truncate/append; c) update the file length.
4. If a block is not committed to the namenode, the file length is not updated and clients cannot read the block.
5. I will add more failure scenarios.
Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is HDFS client’s responsibility to re-added with new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
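A rough sketch of the work-flow described above follows, with hypothetical method names standing in for the calls the proposal mentions; addBlock/copyBlock/commitBlock here are not existing HDFS client APIs.
{code}
// Sketch of the proposed immutable-block append: copy the committed bytes into a
// freshly allocated block, write the new data, then commit. Names are hypothetical.
interface ProposedNameNodeCalls {
  long addBlock(String path);                                      // allocate a new block id
  void copyBlock(long oldBlockId, long newBlockId, long length);   // copy first `length` bytes
  void commitBlock(String path, long blockId, long newFileLength); // make the block immutable
}

class ImmutableBlockAppend {
  private final ProposedNameNodeCalls nn;

  ImmutableBlockAppend(ProposedNameNodeCalls nn) { this.nn = nn; }

  void append(String path, long lastBlockId, long lastBlockLength, byte[] data) {
    long newBlockId = nn.addBlock(path);                     // 1. fresh block id, no genStamp
    nn.copyBlock(lastBlockId, newBlockId, lastBlockLength);  // 2. copy the committed bytes
    // 3. write `data` to the new block through the pipeline (omitted), then commit.
    nn.commitBlock(path, newBlockId, lastBlockLength + data.length);
    // Until commitBlock succeeds the file length is unchanged, so readers only see
    // the old committed block; a failed client simply retries with another block id.
  }
}
{code}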
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934492#comment-13934492 ] Hadoop QA commented on HDFS-6102: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634587/hdfs-6102-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6399//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6399//console This message is automatically generated. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934497#comment-13934497 ] Hadoop QA commented on HDFS-6099: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634573/HDFS-6099.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6400//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6400//console This message is automatically generated. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)