[jira] [Updated] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
[ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jihoon Son updated HDFS-4931: - Attachment: HDFS-4931.patch Extend the block placement policy interface to utilize the location information of previously stored files Key: HDFS-4931 URL: https://issues.apache.org/jira/browse/HDFS-4931 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jihoon Son Attachments: HDFS-4931.patch I'm currently implementing a locality-preserving block placement policy that stores the files of a directory on the same datanode. That is, given a root directory, the files under it are grouped by the paths of their parent directories, and the files of each group are stored on the same datanode. When a new file is stored in HDFS, the block placement policy chooses the target datanode by considering the locations of previously stored files. The current block placement policy interface has two problems. The first is that there is no interface to restore the locations of previously stored files when HDFS is restarted; this restoration has to happen while the namenode is in safe mode. To solve this, I modified the block placement policy interface and FSNamesystem so that, before leaving safe mode, all necessary location information is sent to the block placement policy. However, my implementation changes many access modifiers from private to public, which may violate the design of the interface. The second problem occurs when blocks are moved by the balancer or after node failures: in that case the block placement policy should recognize the current status and return a new datanode to move blocks to, but the current interface does not support this. The attached patch solves the first problem, but, as mentioned above, it may violate the design of the interface. Do you have any good ideas?
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
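The grouping rule described above (all files under one parent directory mapped to one datanode) can be sketched independently of the Hadoop API. The class and method names below are hypothetical illustrations, not the HDFS-4931 patch; a real policy would subclass Hadoop's BlockPlacementPolicy rather than this standalone class:

```java
import java.util.List;

// Hypothetical sketch of the keying rule from the description above:
// every file under the same parent directory maps to the same datanode.
// This does NOT reproduce the actual patch; it only illustrates the idea.
public class DirectoryGroupingSketch {
    private final List<String> datanodes;

    public DirectoryGroupingSketch(List<String> datanodes) {
        this.datanodes = datanodes;
    }

    /** Extract the grouping key: the parent directory of the file path. */
    static String parentDir(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    /** Deterministically pick one datanode per parent directory. */
    public String chooseTarget(String filePath) {
        // floorMod because String.hashCode may be negative.
        int idx = Math.floorMod(parentDir(filePath).hashCode(), datanodes.size());
        return datanodes.get(idx);
    }
}
```

Because the choice depends only on the parent path, two files in the same directory always land on the same node; the open question in the issue is how to rebuild this mapping after a restart, which the hash-based sketch sidesteps entirely.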
[jira] [Updated] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
[ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jihoon Son updated HDFS-4931: - Attachment: (was: HDFS-4931.patch) Extend the block placement policy interface to utilize the location information of previously stored files Key: HDFS-4931 URL: https://issues.apache.org/jira/browse/HDFS-4931 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jihoon Son Attachments: HDFS-4931.patch
[jira] [Commented] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
[ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692883#comment-13692883 ] Fengdong Yu commented on HDFS-4931: --- I don't think this is good. If a file's data is placed on only a few datanodes, then it's likely that more map tasks will run on the same node. Extend the block placement policy interface to utilize the location information of previously stored files Key: HDFS-4931 URL: https://issues.apache.org/jira/browse/HDFS-4931 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jihoon Son Attachments: HDFS-4931.patch
[jira] [Commented] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly
[ https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692884#comment-13692884 ] Fengdong Yu commented on HDFS-1172: --- bq. This litters the task logs with the NotReplicatedYetException This looks like the client requests a new block before the previous block's pipeline has finished. Blocks in newly completed files are considered under-replicated too quickly --- Key: HDFS-1172 URL: https://issues.apache.org/jira/browse/HDFS-1172 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.21.0 Reporter: Todd Lipcon Fix For: 0.24.0 Attachments: HDFS-1172.patch, hdfs-1172.txt, hdfs-1172.txt, replicateBlocksFUC1.patch, replicateBlocksFUC1.patch, replicateBlocksFUC.patch I've seen this for a long time, and imagine it's a known issue, but couldn't find an existing JIRA. It often happens that we see the NN schedule replication on the last block of files very quickly after they're completed, before the other DNs in the pipeline have a chance to report the new block. This results in a lot of extra replication work on the cluster, as we replicate the block and then end up with multiple excess replicas which are very quickly deleted.
[jira] [Commented] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
[ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692897#comment-13692897 ] Steve Loughran commented on HDFS-4931: -- I can see the benefits of this in some applications -though MR jobs aren't necessarily it, as scattering the blocks gives you better bandwidth. By keeping them all on one node, the max bandwidth is the # of HDDs on that node, minus all other work going on on those disks. If scattered, the bandwidth is the # of blocks of the file, minus other work going on against the same blocks. To make things worse -any other code that is trying to access another file on the same machine is going to fight for exactly the same set of hard disks. # The failure mode of the cluster will change. You should look at that carefully. # You aren't going to handle a full disk very well, as at that point your constraints don't get satisfied. # Rebalance and recovery time will increase, as now all the rebalanced blocks are being directed to a single server, limited by both the HDD and net bandwidth of that device, rather than the aggregate bandwidth of the cluster. Assuming all three copies of a file's blocks are stored on only 3 machines, you get hurt at both ends: as the time to recover increases, exposure to multiple HDD/node failures increases too. I think it may be an interesting experiment, but you need to start looking at the impact of failures, and the performance problems. Overall, though, I'm not convinced it scales well, either to large files or large clusters -the latter offering the IO and network bandwidth this policy would fail to exploit, and the highest failure rates. Normally that failure rate is background noise, but with this placement policy, it may be more visible. What may be more useful is revisiting Facebook's work on a sub-cluster placement policy, where all blocks of a file are stored in the same set of racks in a larger cluster. You get more chance of rack locality for multiple blocks, and when a rack fails, while some files suffer more, a lot of files suffer less -and recovery bandwidth is restricted to a fraction of the net, which, on a multi-layered network, may protect the backbone. Because it's experimental and has scale issues, I don't see a rush to commit patches to support it unless it's backed up by the theory and the data justifying this tactic. Extend the block placement policy interface to utilize the location information of previously stored files Key: HDFS-4931 URL: https://issues.apache.org/jira/browse/HDFS-4931 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jihoon Son Attachments: HDFS-4931.patch
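Steve's bandwidth argument can be made concrete with a back-of-the-envelope calculation. The numbers below are assumptions for illustration, not figures from the thread: 12 disks per node at 100 MB/s each, versus a 100-block file scattered across 100 distinct nodes:

```java
// Back-of-the-envelope sketch of the read-bandwidth comparison above.
// All figures are assumed for illustration: disk throughput in MB/s,
// ignoring contention from other jobs on the same disks.
public class PlacementBandwidthSketch {
    /** All blocks on one node: reads are capped by that node's disks. */
    public static long singleNodeReadMBps(int disksPerNode, long diskMBps) {
        return (long) disksPerNode * diskMBps;
    }

    /** One block per node: each block can be served by a separate disk. */
    public static long scatteredReadMBps(int blocks, long diskMBps) {
        return (long) blocks * diskMBps;
    }
}
```

Under these assumed numbers, the single-node layout tops out at 12 x 100 = 1,200 MB/s while the scattered layout can in principle reach 100 x 100 = 10,000 MB/s, which is the gap Steve is pointing at before even counting contention or recovery traffic.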
[jira] [Created] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
Fengdong Yu created HDFS-4932: - Summary: Avoid a long line on the name node webUI if we have more Journal nodes Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line to show just three journal nodes per line.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: HDFS-4932.patch Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: (was: HDFS-4932.patch) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Description: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show four journal nodes on each line. (was: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show fourjournal nodes on each line.) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta
[jira] [Commented] (HDFS-4927) CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode.
[ https://issues.apache.org/jira/browse/HDFS-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692916#comment-13692916 ] Hudson commented on HDFS-4927: -- Integrated in Hadoop-Yarn-trunk #251 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/251/]) HDFS-4927. CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode. Contributed by Chris Nauroth. (Revision 1496350) Result = FAILURE cnauroth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1496350 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCreateEditsLog.java CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode. -- Key: HDFS-4927 URL: https://issues.apache.org/jira/browse/HDFS-4927 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-4927.1.patch {{CreateEditsLog#addFiles}} always creates inodes with ID hard-coded to {{INodeId#GRANDFATHER_INODE_ID}}. At initialization time, the namenode will not load the resulting edits, because this is an invalid inode ID.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Description: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show fourjournal nodes on each line. (was: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show three journal nodes on each line.) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Description: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show three journal nodes on each line. (was: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show four journal nodes on each line.) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: HDFS-4932.patch Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Status: Patch Available (was: Open) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch I don't change CSS because I don't want to affect other related web styles.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Description: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show three journal nodes on each line. I don't change CSS because I don't want to affect other related web styles. (was: If we have more Journal nodes, It shows a long line on the name node webUI, this patch wrapped line. just show three journal nodes on each line.) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch
[jira] [Commented] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692980#comment-13692980 ] Hadoop QA commented on HDFS-4932: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589585/HDFS-4932.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4563//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4563//console This message is automatically generated. Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch
[jira] [Created] (HDFS-4934) add symlink support to WebHDFS server side
Alejandro Abdelnur created HDFS-4934: Summary: add symlink support to WebHDFS server side Key: HDFS-4934 URL: https://issues.apache.org/jira/browse/HDFS-4934 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.2.0 Environment: followup on HADOOP-8040 Reporter: Alejandro Abdelnur follow up on HADOOP-8040
[jira] [Created] (HDFS-4935) add symlink support to HttpFS server side
Alejandro Abdelnur created HDFS-4935: Summary: add symlink support to HttpFS server side Key: HDFS-4935 URL: https://issues.apache.org/jira/browse/HDFS-4935 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Environment: followup on HADOOP-8040 Reporter: Alejandro Abdelnur follow up on HADOOP-8040
[jira] [Commented] (HDFS-4927) CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode.
[ https://issues.apache.org/jira/browse/HDFS-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693006#comment-13693006 ] Hudson commented on HDFS-4927: -- Integrated in Hadoop-Hdfs-trunk #1441 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1441/]) HDFS-4927. CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode. Contributed by Chris Nauroth. (Revision 1496350) Result = FAILURE
[jira] [Created] (HDFS-4936) Handle overflow condition for txid going over Long.MAX_VALUE
Harsh J created HDFS-4936: - Summary: Handle overflow condition for txid going over Long.MAX_VALUE Key: HDFS-4936 URL: https://issues.apache.org/jira/browse/HDFS-4936 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor Hat tip to [~fengdon...@gmail.com] for the question that led to this (on the mailing lists). I hacked up my local NN's txids manually to go very large (close to the max) and decided to try out whether this causes any harm. I basically bumped up the freshly formatted files' starting txid to 9223372036854775805 (and ensured the image references the same by hex-editing it): {code} ➜ current ls VERSION fsimage_9223372036854775805.md5 fsimage_9223372036854775805 seen_txid ➜ current cat seen_txid 9223372036854775805 {code} NameNode started up as expected. {code} 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded in 0 seconds. 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid 9223372036854775805 from /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at 9223372036854775806 {code} I could create a bunch of files and do regular ops. I created over 10 files, just to make the txid go well over Long.MAX_VALUE.
Quitting NameNode and restarting fails though, with the following error: {code} 13/06/25 18:31:08 INFO namenode.FileJournalManager: Recovering unfinalized segments in /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current 13/06/25 18:31:08 INFO namenode.FileJournalManager: Finalizing edits file /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_inprogress_9223372036854775806 -> /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_9223372036854775806-9223372036854775807 13/06/25 18:31:08 FATAL namenode.NameNode: Exception in namenode join java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 9223372036854775806 but unable to find any edit logs containing txid -9223372036854775808 at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1194) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1152) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:616) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:267) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:399) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:433) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:590) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1141) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205) {code} Looks like we also lose some edits when we restart, as noted by the
finalized edits filename: {code} VERSION edits_9223372036854775806-9223372036854775807 fsimage_9223372036854775805 fsimage_9223372036854775805.md5 seen_txid {code} It seems like we won't be able to handle the case where the txid overflows. It's a very, very large number, so that's not an immediate concern, but it seemed worth reporting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
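The negative txid in the restart error above (-9223372036854775808) is ordinary two's-complement wraparound. A minimal demonstration in plain Java (not HDFS code):

```java
public class TxidOverflowDemo {
    public static void main(String[] args) {
        long txid = Long.MAX_VALUE;   // 9223372036854775807
        long next = txid + 1;         // silently wraps to Long.MIN_VALUE
        System.out.println(next);     // -9223372036854775808
        // Math.addExact would surface the overflow instead of wrapping:
        try {
            Math.addExact(txid, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

This matches the log: the in-progress segment starting at 9223372036854775806 was finalized at 9223372036854775807, and the next expected txid wrapped negative, producing the "Gap in transactions" failure.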
[jira] [Commented] (HDFS-4927) CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode.
[ https://issues.apache.org/jira/browse/HDFS-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693036#comment-13693036 ] Hudson commented on HDFS-4927: -- Integrated in Hadoop-Mapreduce-trunk #1468 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1468/]) HDFS-4927. CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode. Contributed by Chris Nauroth. (Revision 1496350) Result = FAILURE cnauroth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1496350 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCreateEditsLog.java CreateEditsLog creates inodes with an invalid inode ID, which then cannot be loaded by a namenode. -- Key: HDFS-4927 URL: https://issues.apache.org/jira/browse/HDFS-4927 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-4927.1.patch {{CreateEditsLog#addFiles}} always creates inodes with the ID hard-coded to {{INodeId#GRANDFATHER_INODE_ID}}. At initialization time, the namenode will not load the resulting edits, because this is an invalid inode ID.
[jira] [Commented] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
[ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693047#comment-13693047 ] Jihoon Son commented on HDFS-4931: -- Thanks for your comments. I'll think more about this idea.
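The grouping idea behind HDFS-4931 can be sketched as below. This is a hedged illustration only: the class, method names, and string-based node handles are invented for the sketch and are not the actual BlockPlacementPolicy API. Files sharing a parent directory form a group, and the group's first placement pins the datanode for subsequent files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a locality-preserving grouping: files under the
// same parent directory are stored on the same datanode.
public class DirectoryGroupPlacement {
    // group key (parent directory path) -> datanode pinned for that group;
    // this is exactly the state that would need restoring on NN restart.
    private final Map<String, String> groupToDatanode = new HashMap<>();

    private static String parentDir(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    /** Returns the datanode previously chosen for this file's group, or
     *  records {@code candidate} as the group's datanode if none exists. */
    public String chooseTarget(String filePath, String candidate) {
        return groupToDatanode.computeIfAbsent(parentDir(filePath), dir -> candidate);
    }
}
```

Rebuilding the group-to-datanode map after a restart, before the namenode leaves safe mode, is precisely the first problem the reporter describes; the second problem (blocks relocated by the balancer or node failures invalidating the map) is likewise invisible to this sketch.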
[jira] [Commented] (HDFS-4936) Handle overflow condition for txid going over Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/HDFS-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693112#comment-13693112 ] Harsh J commented on HDFS-4936: --- Expected this response. Resolving. (From [~tlipcon] over hdfs-dev@) {code} I did some back-of-the-envelope math when implementing txids, and determined that overflow is not ever going to happen... A busy namenode does 1000 write transactions/second (2^10). MAX_LONG is 2^63. So, at 1k tps, we can run for 2^53 seconds. A year is about 2^25 seconds. So you can run your namenode for 2^(63-10-25) = 2^28 ≈ 268 million years. Hadoop is great software and I'm sure it will be around for years to come, but if it's still running in 268 million years, that will be a pretty depressing rate of technological progress! -Todd {code}
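Todd's estimate checks out numerically; plain arithmetic, not HDFS code:

```java
public class TxidLifetime {
    public static void main(String[] args) {
        // 2^63 transactions, at ~2^10 write tx/s, with ~2^25 seconds per year:
        double years = Math.pow(2, 63 - 10 - 25);   // 2^28
        System.out.println(years);                  // 2.68435456E8, i.e. ~268 million years
    }
}
```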
[jira] [Resolved] (HDFS-4936) Handle overflow condition for txid going over Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/HDFS-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HDFS-4936. --- Resolution: Not A Problem
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693124#comment-13693124 ] Kihwal Lee commented on HDFS-4888: -- Ravi, here are my review comments: * {{//Automatic safemode}} : the resource-low case is also automatic. Perhaps you meant the start-up safe mode? * Removal of the {{datanodeThreshold > 0}} condition: setting this threshold to 0 disables the check. So we still want this check in the original place, and also around the message "needs additional x live datanodes". * When printing out the extension period, {{reached + extension - now()}} can become negative if repl queue init takes more than the extension (30 seconds by default). Use of Linux THP makes this a lot faster, but it can still exceed 30 seconds for large name spaces. If that happens, the time-to-exit increases every time the message is printed. Rather than Math.abs(), it should say something like "soon". Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex
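The clamping suggested in the last review bullet can be sketched like this (a hedged sketch; the method and variable names are invented, not the actual getTurnOffTip code): once the extension window has already elapsed, report "soon" instead of a negative, growing countdown.

```java
public class SafeModeTip {
    /** Hypothetical helper: renders the time left in the safe-mode extension,
     *  clamping to a friendly message once the window has elapsed. */
    static String timeToLeave(long reachedMs, long extensionMs, long nowMs) {
        long msLeft = reachedMs + extensionMs - nowMs;
        if (msLeft <= 0) {
            // repl queue init outran the extension; avoid Math.abs() here,
            // which would print a countdown that grows instead of shrinking
            return "Safe mode will be turned off soon.";
        }
        return "Safe mode will be turned off in " + (msLeft / 1000) + " seconds.";
    }
}
```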
[jira] [Updated] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4762: - Attachment: HDFS-4762.patch.3 Loaded the patch again to run findbugs. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3 This is to track the implementation of NFSv3 for HDFS.
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693308#comment-13693308 ] Hadoop QA commented on HDFS-4762: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589647/HDFS-4762.patch.3 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 1.3.9) to fail. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4564//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4564//console This message is automatically generated.
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693315#comment-13693315 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- load the patch again to run findbugs. Brandon, there are errors when running findbugs on hadoop-hdfs-nfs; see https://builds.apache.org/job/PreCommit-HDFS-Build/4564/artifact/trunk/patchprocess/patchFindBugsOutputhadoop-hdfs-nfs.txt [ERROR] Could not find resource '/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/dev-support/findbugsExcludeFile.xml'. -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ResourceNotFoundException
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693362#comment-13693362 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- Some more comments: - In READDIR3Response and READDIRPLUS3Response, use System.arraycopy(..) for copying entries. {code} + for (int i = 0; i < entries.length; i++) { +   this.entries[i] = entries[i]; + } {code} - Use the following for Nfs3Utils.bytesToLong(..) and do not use ByteBuffer. {code} public static long bytesToLong(byte[] data) { long n = 0xffL & data[0]; for(int i = 1; i < 8; i++) { n = (n << 8) | (0xffL & data[i]); } return n; } {code} - In OffsetRange.hasOverlap(..), it seems that the case min == rangeMax and the case max == rangeMin are not handled correctly. Indeed, hasOverlap(..) can be considered as !(noOverlap), and noOverlap must be min > rangeMax || max < rangeMin, i.e. {code} boolean noOverlap(OffsetRange range) { return min > range.max || max < range.min; } boolean hasOverlap(OffsetRange range) { return !noOverlap(range); } {code} - OffsetRange may not implement Comparable correctly. The return value of compareTo has to follow the rules in the [compareTo javadoc|http://docs.oracle.com/javase/6/docs/api/java/lang/Comparable.html#compareTo%28T%29]. We cannot define it as 0: identical, -1: on the left, 1: on the right, 2: overlapped. In particular, if x.compareTo(y) returns 2 (i.e. when x and y overlap), then y.compareTo(x) also returns 2. Such x and y do not follow the compareTo rules.
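One way to satisfy the compareTo contract raised in the last comment is to keep ordering and overlap as separate concerns. A hedged sketch, not the patch's actual class: order by (min, max), which is antisymmetric and transitive, and test overlap with a predicate instead of the out-of-contract return value 2.

```java
// Sketch: a contract-compliant Comparable plus a separate overlap test.
public class OffsetRange implements Comparable<OffsetRange> {
    final long min, max;

    OffsetRange(long min, long max) { this.min = min; this.max = max; }

    @Override
    public int compareTo(OffsetRange o) {
        // total order by (min, max); never encodes "overlap" in the result
        int c = Long.compare(min, o.min);
        return c != 0 ? c : Long.compare(max, o.max);
    }

    boolean hasOverlap(OffsetRange o) {
        // !(noOverlap); touching endpoints (min == o.max etc.) count as overlap
        return !(min > o.max || max < o.min);
    }
}
```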
[jira] [Commented] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693393#comment-13693393 ] Chris Nauroth commented on HDFS-4932: - Hi, Fengdong. The patch looks good. A couple of minor comments: # Can you please add a comment stating that this code inserts a line break every 3 entries to prevent very wide lines? # Minor style issue for each of the following lines: can you please add a space before the opening braces? Also, can you please add a space after {{for}}? {code} if (null != manager){ {code} {code} for(int i = 0; i < managers.length; ++i){ {code} {code} if (i < managers.length - 1){ {code} {code} if ((i+1) % 3 == 0){ {code} {quote} -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {quote} I think this is OK if manual testing was done. You'd have to do a lot of awkward deep mocking to be able to call this method from a test, so it's probably not worth it. Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932.patch If we have more Journal nodes, it shows a long line on the name node webUI; this patch wraps the line, showing just three journal nodes on each line. I don't change the CSS because I don't want to affect other related web styles.
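The wrapping logic under review can be sketched as follows. This is an illustrative sketch assembled from the fragments quoted in the comments above; the class name, method name, and the `<br/>` tag are assumptions, not the patch itself.

```java
public class JournalNodeList {
    /** Joins entries with ", ", inserting a line break after every third
     *  entry so the namenode web UI row stays narrow. */
    static String wrapEvery3(String[] managers) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < managers.length; i++) {
            sb.append(managers[i]);
            if (i < managers.length - 1) {
                sb.append(", ");
                if ((i + 1) % 3 == 0) {
                    sb.append("<br/>");  // break the line every 3 entries
                }
            }
        }
        return sb.toString();
    }
}
```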
[jira] [Created] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
Kihwal Lee created HDFS-4937: Summary: ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 0.23.8, 2.0.4-alpha Reporter: Kihwal Lee When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the old cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks. All nodes will end up in the excluded list, but its size will still be less than the previously recorded cluster size, so it will loop infinitely. This has been seen in a production environment.
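The failure mode described above can be modeled in a few lines. This is a simplified sketch, not the real BlockPlacementPolicyDefault: if the loop bound is a cluster size cached before the refresh, excluding every live node never reaches it and the loop spins; bounding by the current size terminates.

```java
import java.util.List;
import java.util.Random;
import java.util.Set;

public class ChooseRandomSketch {
    /** Simplified model of the fixed loop: bound the search by the *current*
     *  cluster size. With a size cached before a node-list refresh shrank the
     *  cluster, excluded.size() could stay below the bound forever. */
    static String chooseRandom(List<String> cluster, Set<String> excluded) {
        Random r = new Random(0);
        while (excluded.size() < cluster.size()) {  // current size, not a stale cache
            String candidate = cluster.get(r.nextInt(cluster.size()));
            if (!excluded.contains(candidate)) {
                return candidate;
            }
        }
        return null;  // every node excluded: give up instead of looping
    }
}
```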
[jira] [Updated] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4937: - Description: When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This has been seen in a production environment. was: When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the old cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks. All nodes will end up in the excluded list, but the size will still be less than the previously recorded cluster size, so it will loop infinitely. This has been seen in a production environment.
[jira] [Updated] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4937: - Description: When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. was: When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if the refresh removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This has been seen in a production environment.
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693424#comment-13693424 ] Kihwal Lee commented on HDFS-4937: -- This can mostly be avoided by decommissioning nodes in smaller batches, which is the recommended practice. But in this particular case, the operator added a large number of new nodes and decommissioned the old nodes.
[jira] [Created] (HDFS-4938) Reduce redundant information in edit logs and image files
Arpit Agarwal created HDFS-4938: --- Summary: Reduce redundant information in edit logs and image files Key: HDFS-4938 URL: https://issues.apache.org/jira/browse/HDFS-4938 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Generation stamps are logged as edits and in image files on checkpoint. This is potentially redundant, as the generation stamp is also logged with block creation/append. This JIRA is to investigate and remove any redundant fields.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: HDFS-4932.PNG Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, HDFS-4932.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
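The three-per-line wrapping described above can be sketched roughly as follows. This is a hypothetical helper, not the actual patch (which edits the webUI JSP output); it only illustrates the formatting rule: comma-separate addresses, break the line after every third one.

```java
import java.util.Arrays;
import java.util.List;

public class WrapJournalNodes {
    // Illustrative helper: render journal node addresses three per line
    // instead of one long row.
    static String wrapThreePerLine(List<String> nodes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.size(); i++) {
            sb.append(nodes.get(i));
            // break after every third node or after the last one,
            // otherwise comma-separate
            if ((i + 1) % 3 == 0 || i == nodes.size() - 1) sb.append('\n');
            else sb.append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> jns = Arrays.asList("jn1:8485", "jn2:8485", "jn3:8485",
                                         "jn4:8485", "jn5:8485");
        System.out.print(wrapThreePerLine(jns));
        // prints:
        // jn1:8485, jn2:8485, jn3:8485
        // jn4:8485, jn5:8485
    }
}
```

Doing the wrapping in the generated markup, as the issue description notes, avoids touching the shared CSS and so cannot affect other pages' styles.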
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: HDFS-4932-002.patch Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, HDFS-4932.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
[jira] [Commented] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13693546#comment-13693546 ] Fengdong Yu commented on HDFS-4932: --- Thanks Chris, the new patch is uploaded. I compiled the change and restarted our testing cluster; the manual testing result was also uploaded. Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, HDFS-4932.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
[jira] [Updated] (HDFS-4936) Handle overflow condition for txid going over Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/HDFS-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4936: -- Description: Hat tip to [~azury...@gmail.com] for the question that led to this (on the mailing lists). I hacked up my local NN's txids manually to go very large (close to the max) and decided to try out whether this causes any harm. I basically bumped up the freshly formatted files' starting txid to 9223372036854775805 (and ensured the image references the same by hex-editing it): {code} ➜ current ls VERSION fsimage_9223372036854775805.md5 fsimage_9223372036854775805 seen_txid ➜ current cat seen_txid 9223372036854775805 {code} NameNode started up as expected. {code} 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded in 0 seconds. 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid 9223372036854775805 from /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at 9223372036854775806 {code} I could create a bunch of files and do regular ops (incrementing txids well past the long max). I created over 10 files, just to make the txid go well over Long.MAX_VALUE. Quitting the NameNode and restarting fails, though, with the following error: {code} 13/06/25 18:31:08 INFO namenode.FileJournalManager: Recovering unfinalized segments in /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current 13/06/25 18:31:08 INFO namenode.FileJournalManager: Finalizing edits file /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_inprogress_9223372036854775806 - /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_9223372036854775806-9223372036854775807 13/06/25 18:31:08 FATAL namenode.NameNode: Exception in namenode join java.io.IOException: Gap in transactions. 
Expected to be able to read up until at least txid 9223372036854775806 but unable to find any edit logs containing txid -9223372036854775808 at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1194) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1152) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:616) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:267) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:399) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:433) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:609) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:590) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1141) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205) {code} It looks like we also lose some edits when we restart, as noted by the finalized edits filename: {code} VERSION edits_9223372036854775806-9223372036854775807 fsimage_9223372036854775805 fsimage_9223372036854775805.md5 seen_txid {code} It seems we won't be able to handle the case where the txid overflows. It's a very large number, so that's not an immediate concern, but it seemed worthy of a report. was: Hat tip to [~fengdon...@gmail.com] for the question that led to this (on the mailing lists). I hacked up my local NN's txids manually to go very large (close to the max) and decided to try out whether this causes any harm. 
I basically bumped up the freshly formatted files' starting txid to 9223372036854775805 (and ensured the image references the same by hex-editing it): {code} ➜ current ls VERSION fsimage_9223372036854775805.md5 fsimage_9223372036854775805 seen_txid ➜ current cat seen_txid 9223372036854775805 {code} NameNode started up as expected. {code} 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded in 0 seconds. 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid 9223372036854775805 from /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at 9223372036854775806 {code} I could create a bunch of files and do regular ops (incrementing txids well past the long max). I created over 10 files, just to make the txid go well over Long.MAX_VALUE. Quitting the NameNode and restarting fails, though, with the following error: {code} 13/06/25 18:31:08 INFO namenode.FileJournalManager: Recovering unfinalized
[jira] [Updated] (HDFS-4936) Handle overflow condition for txid going over Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/HDFS-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4936: -- Description: Hat tip to [~azuryy] for the question that led to this (on the mailing lists). I hacked up my local NN's txids manually to go very large (close to the max) and decided to try out whether this causes any harm. I basically bumped up the freshly formatted files' starting txid to 9223372036854775805 (and ensured the image references the same by hex-editing it): {code} ➜ current ls VERSION fsimage_9223372036854775805.md5 fsimage_9223372036854775805 seen_txid ➜ current cat seen_txid 9223372036854775805 {code} NameNode started up as expected. {code} 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded in 0 seconds. 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid 9223372036854775805 from /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at 9223372036854775806 {code} I could create a bunch of files and do regular ops (incrementing txids well past the long max). I created over 10 files, just to make the txid go well over Long.MAX_VALUE. Quitting the NameNode and restarting fails, though, with the following error: {code} 13/06/25 18:31:08 INFO namenode.FileJournalManager: Recovering unfinalized segments in /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current 13/06/25 18:31:08 INFO namenode.FileJournalManager: Finalizing edits file /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_inprogress_9223372036854775806 - /Users/harshchouraria/Work/installs/temp-space/tmp-default/dfs-cdh4/name/current/edits_9223372036854775806-9223372036854775807 13/06/25 18:31:08 FATAL namenode.NameNode: Exception in namenode join java.io.IOException: Gap in transactions. 
Expected to be able to read up until at least txid 9223372036854775806 but unable to find any edit logs containing txid -9223372036854775808 at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1194) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1152) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:616) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:267) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:399) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:433) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:609) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:590) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1141) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205) {code} It looks like we also lose some edits when we restart, as noted by the finalized edits filename: {code} VERSION edits_9223372036854775806-9223372036854775807 fsimage_9223372036854775805 fsimage_9223372036854775805.md5 seen_txid {code} It seems we won't be able to handle the case where the txid overflows. It's a very large number, so that's not an immediate concern, but it seemed worthy of a report. was: Hat tip to [~azury...@gmail.com] for the question that led to this (on the mailing lists). I hacked up my local NN's txids manually to go very large (close to the max) and decided to try out whether this causes any harm. 
I basically bumped up the freshly formatted files' starting txid to 9223372036854775805 (and ensured the image references the same by hex-editing it): {code} ➜ current ls VERSION fsimage_9223372036854775805.md5 fsimage_9223372036854775805 seen_txid ➜ current cat seen_txid 9223372036854775805 {code} NameNode started up as expected. {code} 13/06/25 18:30:08 INFO namenode.FSImage: Image file of size 129 loaded in 0 seconds. 13/06/25 18:30:08 INFO namenode.FSImage: Loaded image for txid 9223372036854775805 from /temp-space/tmp-default/dfs-cdh4/name/current/fsimage_9223372036854775805 13/06/25 18:30:08 INFO namenode.FSEditLog: Starting log segment at 9223372036854775806 {code} I could create a bunch of files and do regular ops (incrementing txids well past the long max). I created over 10 files, just to make the txid go well over Long.MAX_VALUE. Quitting the NameNode and restarting fails, though, with the following error: {code} 13/06/25 18:31:08 INFO namenode.FileJournalManager: Recovering unfinalized segments in
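The negative txid -9223372036854775808 in the "Gap in transactions" error above is exactly what Java's silent two's-complement wraparound produces: incrementing a long past Long.MAX_VALUE yields Long.MIN_VALUE, with no exception thrown. A minimal demonstration:

```java
public class TxidOverflowDemo {
    public static void main(String[] args) {
        long txid = Long.MAX_VALUE;  // 9223372036854775807
        // Long addition in Java wraps around silently on overflow;
        // this is where the -9223372036854775808 in the error comes from.
        long next = txid + 1;
        System.out.println(next == Long.MIN_VALUE);  // true
        System.out.println(next);                    // -9223372036854775808
    }
}
```

Detecting the wrap explicitly (for example via Math.addExact, which throws ArithmeticException on overflow) would at least turn the silent edit-log gap into a loud failure, though as the report notes, reaching Long.MAX_VALUE transactions is not a practical concern.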
[jira] [Commented] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13693548#comment-13693548 ] Hadoop QA commented on HDFS-4932: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589691/HDFS-4932.PNG against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4565//console This message is automatically generated. Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, HDFS-4932.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: (was: HDFS-4932.PNG) Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
[jira] [Updated] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengdong Yu updated HDFS-4932: -- Attachment: scree-short.PNG Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, scree-short.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.
[jira] [Commented] (HDFS-4932) Avoid a long line on the name node webUI if we have more Journal nodes
[ https://issues.apache.org/jira/browse/HDFS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13693560#comment-13693560 ] Hadoop QA commented on HDFS-4932: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589694/scree-short.PNG against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4566//console This message is automatically generated. Avoid a long line on the name node webUI if we have more Journal nodes -- Key: HDFS-4932 URL: https://issues.apache.org/jira/browse/HDFS-4932 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Fengdong Yu Assignee: Fengdong Yu Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4932-002.patch, HDFS-4932.patch, scree-short.PNG If we have many Journal nodes, the name node webUI shows one long line; this patch wraps the line so that only three journal nodes are shown per line. I don't change the CSS because I don't want to affect other related web styles.