[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940229#comment-13940229 ]

Jing Zhao commented on HDFS-6089:
---------------------------------

Hi Andrew, thanks for the explanation. I guess I understand your concern now: rolling on the ANN based only on the number of edits may cause issues in some scenarios, because if there are no further operations, the SBN may wait a long time to tail the edits that sit in an in-progress segment.

bq. Checkpointing combines the edit log with the fsimage, and we purge unnecessary log segments afterwards.

But I'm still a little confused about this part. I fail to see the difference between time-based rolling triggered from the SBN and from the ANN. In the current code, the SBN still triggers rolling through an RPC to the ANN. Also, this does not affect checkpointing and purging: when the SBN does a checkpoint, both the SBN and the ANN purge old edits in their own storage (the SBN purges before uploading the checkpoint, and the ANN after receiving the new fsimage). So a possible solution may be: just let the ANN roll its edit log every 2min. I think this achieves almost the same effect as the current mechanism, without delaying the failover. Or do you see counterexamples with this change?

Back to the solution of changing the RPC timeout. It looks like we have not set a timeout for this NN-to-NN RPC right now (correct me if I'm wrong). Setting a timeout (e.g., 20s, just like the default timeout from client to NN) can of course improve the failover time in our test case, but I still prefer the solution above because it makes the rolling behavior simpler and more predictable (in particular, it removes the RPC call from the SBN to the ANN).

Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
-------------------------------------------------------------------------------------------------------------------------

Key: HDFS-6089
URL: https://issues.apache.org/jira/browse/HDFS-6089
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch

The following scenario was tested:
* Determine the Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active

We noticed that sometimes the call to get the service state of nn2 got a socket timeout exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
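For concreteness, a minimal Java sketch of the ANN-driven time-based rolling suggested above. The class name and the {{EditLog}} interface are hypothetical stand-ins for illustration, not the actual NameNode code:
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ActiveNameNodeRoller {
  // Hypothetical stand-in for the ANN's edit log; not the real FSEditLog API.
  interface EditLog {
    void rollEditLog();
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // Roll the ANN's own edit log on a fixed period, so the SBN never issues a
  // rollEditLog RPC and therefore cannot block on a suspended peer NN.
  void start(final EditLog editLog) {
    scheduler.scheduleAtFixedRate(new Runnable() {
      @Override
      public void run() {
        try {
          editLog.rollEditLog(); // local call on the ANN; no cross-NN RPC
        } catch (RuntimeException e) {
          // log and retry on the next tick
        }
      }
    }, 2, 2, TimeUnit.MINUTES); // "every 2min", as suggested above
  }
}
{code}
Since rolling becomes a purely local operation on the ANN, the failover path no longer depends on the responsiveness of the other NameNode.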
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940234#comment-13940234 ]

Hadoop QA commented on HDFS-6100:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635471/HDFS-6100.001.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6435//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6435//console

This message is automatically generated.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-6100:
-----------------------------

Description:
In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

was:
In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
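To make the proposal concrete, a hedged Java sketch of resolving the {{namenoderpcaddress}} parameter on the DN side. This is not the actual {{DataNodeWebHdfsMethods}} code; the class name is made up, and it assumes the failover proxy provider for the name service is configured:
{code}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class NnAddressResolver {
  // "nnId" may be either a host:port pair or a name service ID such as
  // "mycluster"; both are valid authorities in an hdfs:// URI. With a name
  // service ID, the client goes through the configured failover proxy
  // provider and therefore keeps working across an NN failover.
  static FileSystem resolve(String nnId, Configuration conf) throws IOException {
    return FileSystem.get(URI.create("hdfs://" + nnId), conf);
  }
}
{code}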
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940245#comment-13940245 ]

Haohui Mai commented on HDFS-6100:
----------------------------------

The v1 patch addresses Jing's comments. I've tested the patch in all four combinations of secure and HA setups, and verified it by using WebHDFS to retrieve the contents of a file.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6124) Add final modifier to class members
[ https://issues.apache.org/jira/browse/HDFS-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940244#comment-13940244 ]

Hadoop QA commented on HDFS-6124:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635475/HDFS-6124.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 139 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6436//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6436//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6436//console

This message is automatically generated.

Add final modifier to class members
-----------------------------------

Key: HDFS-6124
URL: https://issues.apache.org/jira/browse/HDFS-6124
Project: Hadoop HDFS
Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Attachments: HDFS-6124.patch

Many of the member variable declarations in HDFS classes are missing the final modifier. This jira adds the final modifier where possible.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6125) Cleanup unnecessary cast in HDFS code base
[ https://issues.apache.org/jira/browse/HDFS-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940286#comment-13940286 ]

Hadoop QA commented on HDFS-6125:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635487/HDFS-6125.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 43 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:
org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6438//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6438//console

This message is automatically generated.

Cleanup unnecessary cast in HDFS code base
------------------------------------------

Key: HDFS-6125
URL: https://issues.apache.org/jira/browse/HDFS-6125
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Attachments: HDFS-6125.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated HDFS-5196:
-------------------------------------

Attachment: HDFS-5196-5.patch

I attach a patch that adds test code to the former patch.

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940351#comment-13940351 ]

Hadoop QA commented on HDFS-6100:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635471/HDFS-6100.001.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6439//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6439//console

This message is automatically generated.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6068) Disallow snapshot names that are also invalid directory names
[ https://issues.apache.org/jira/browse/HDFS-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940396#comment-13940396 ]

Hudson commented on HDFS-6068:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #514 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/514/])

Add TestSnapshotNameWithInvalidCharacters.java for HDFS-6068. (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579005)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotNameWithInvalidCharacters.java

HDFS-6068. Disallow snapshot names that are also invalid directory names. Contributed by sathish (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579004)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java

Disallow snapshot names that are also invalid directory names
-------------------------------------------------------------

Key: HDFS-6068
URL: https://issues.apache.org/jira/browse/HDFS-6068
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: sathish
Fix For: 2.4.0
Attachments: HDFS-6068-0001.patch, HDFS-6068-0002.patch, HDFS-6068-0003.patch, HDFS-6068-0004.patch

There are a number of restrictions on valid names in HDFS. For example, you can't name a directory "." or "..", or something containing a ":". However, I can happily create a snapshot named "a:b:c", resulting in this:
{code}
% hdfs dfs -createSnapshot /home/andrew a:b:c
Created snapshot /home/andrew/.snapshot/a:b:c
% hadoop fs -ls /home/andrew/.snapshot
-ls: java.net.URISyntaxException: Relative path in absolute URI: a:b:c
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
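For illustration, a hypothetical Java validator capturing the rule above; it mirrors the intent of the fix, not the exact code committed to {{FSNamesystem}}:
{code}
public class SnapshotNameValidator {
  // Reject the same names that are invalid as directory names: empty names,
  // ".", "..", and names containing ":" or "/".
  static void check(String name) {
    if (name.isEmpty() || name.equals(".") || name.equals("..")
        || name.contains(":") || name.contains("/")) {
      throw new IllegalArgumentException("Invalid snapshot name: \"" + name + "\"");
    }
  }

  public static void main(String[] args) {
    check("a:b:c"); // throws, matching the behavior the fix introduces
  }
}
{code}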
[jira] [Commented] (HDFS-6117) Print file path information in FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940398#comment-13940398 ]

Hudson commented on HDFS-6117:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #514 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/514/])

HDFS-6117. Print file path information in FileNotFoundException on INode ID mismatch. Contributed by Suresh Srinivas. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579104)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeId.java

Print file path information in FileNotFoundException
----------------------------------------------------

Key: HDFS-6117
URL: https://issues.apache.org/jira/browse/HDFS-6117
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Priority: Minor
Fix For: 2.4.0
Attachments: HDFS-6117.patch

Print file path information in the FileNotFoundException thrown from INodeId#checkId(). This helps debug issues related to possible INode ID mismatches during namenode failover.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
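A minimal Java sketch of the check described above; the method shape and message wording are illustrative only, while the real logic lives in {{INodeId#checkId()}}:
{code}
import java.io.FileNotFoundException;

public class INodeIdCheck {
  // Include the offending path in the exception message so that an INode ID
  // mismatch during NN failover can be traced back to a concrete file.
  static void checkId(long requestedId, long actualId, String path)
      throws FileNotFoundException {
    if (requestedId != actualId) {
      throw new FileNotFoundException("ID mismatch. Request id and saved id: "
          + requestedId + " vs. " + actualId + " for file " + path);
    }
  }
}
{code}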
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940397#comment-13940397 ]

Hudson commented on HDFS-6007:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #514 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/514/])

HDFS-6007. Update documentation about short-circuit local reads (iwasakims via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578994)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm

Update documentation about short-circuit local reads
----------------------------------------------------

Key: HDFS-6007
URL: https://issues.apache.org/jira/browse/HDFS-6007
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch, HDFS-6007-5.patch

Update the contents of the HDFS Short-Circuit Local Reads documentation based on the changes in HDFS-4538 and HDFS-4953.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
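As a companion to the doc update, a hedged Java sketch of the client-side keys involved in short-circuit local reads. The keys come from hdfs-default.xml; the class name and socket path are example values, and these settings are normally placed in hdfs-site.xml rather than set in code:
{code}
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitReadConfig {
  static Configuration create() {
    Configuration conf = new Configuration();
    // Enable short-circuit local reads on the client.
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // Domain socket shared with the DataNode; the path is an example value.
    conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
    return conf;
  }
}
{code}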
[jira] [Commented] (HDFS-6112) NFS Gateway docs are incorrect for allowed hosts configuration
[ https://issues.apache.org/jira/browse/HDFS-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940401#comment-13940401 ]

Hudson commented on HDFS-6112:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #514 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/514/])

HDFS-6112. NFS Gateway docs are incorrect for allowed hosts configuration. Contributed by Aaron T. Myers. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579097)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm

NFS Gateway docs are incorrect for allowed hosts configuration
--------------------------------------------------------------

Key: HDFS-6112
URL: https://issues.apache.org/jira/browse/HDFS-6112
Project: Hadoop HDFS
Issue Type: Bug
Components: nfs
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6112.patch

The NFS gateway export configuration docs say that the machine name configuration can be wildcards and provide the example {{host*.example.com}}. The term "wildcard" and this example might imply typical globbing semantics, but in fact what it actually supports is Java regular expressions. I think we should change the docs to make this clearer.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
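For illustration, a short Java snippet showing the glob-vs-regex distinction the doc fix is about; the class name and hostnames are made up for the example:
{code}
import java.util.regex.Pattern;

// As a Java regular expression, "host*.example.com" means "hos" followed by
// zero or more 't's, not the glob "host<anything>.example.com".
public class ExportHostMatch {
  public static void main(String[] args) {
    String pattern = "host*.example.com";
    System.out.println(Pattern.matches(pattern, "host1.example.com")); // false
    System.out.println(Pattern.matches(pattern, "hostt.example.com")); // true
    // The regex that captures the intended glob semantics:
    System.out.println(Pattern.matches("host.*\\.example\\.com",
                                       "host1.example.com"));          // true
  }
}
{code}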
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940402#comment-13940402 ]

Hudson commented on HDFS-6099:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #514 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/514/])

HDFS-6099. HDFS file system limits not enforced on renames. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579122)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java

HDFS file system limits not enforced on renames.
------------------------------------------------

Key: HDFS-6099
URL: https://issues.apache.org/jira/browse/HDFS-6099
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.3.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Fix For: 2.4.0
Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch, HDFS-6099.3.patch, HDFS-6099.4.patch, HDFS-6099.5.patch

{{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
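For context, a small Java sketch showing where the two limits live. The keys are the ones named above; the class name and values are illustrative only, and the limits are normally set in hdfs-site.xml:
{code}
import org.apache.hadoop.conf.Configuration;

public class FsLimitsConfig {
  static Configuration create() {
    Configuration conf = new Configuration();
    // Maximum length of each path component; example value, not a recommendation.
    conf.setInt("dfs.namenode.fs-limits.max-component-length", 255);
    // Maximum number of entries per directory; example value.
    conf.setInt("dfs.namenode.fs-limits.max-directory-items", 1048576);
    return conf;
  }
}
{code}
With the fix, the NameNode checks the destination path of a rename against both limits, just as it already checked file and directory creates.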
[jira] [Commented] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940427#comment-13940427 ]

Hadoop QA commented on HDFS-5196:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635516/HDFS-5196-5.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestDatanodeConfig
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6440//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6440//console

This message is automatically generated.

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated HDFS-5196:
-------------------------------------

Status: Patch Available (was: Open)

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated HDFS-5196:
-------------------------------------

Status: Open (was: Patch Available)

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Resolved] (HDFS-5996) hadoop 1.1.2. hdfs write bug
[ https://issues.apache.org/jira/browse/HDFS-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangmeng resolved HDFS-5996.
----------------------------

Resolution: Fixed
Release Note: Someone else has already reported this bug, and it has been resolved.

hadoop 1.1.2. hdfs write bug
----------------------------

Key: HDFS-5996
URL: https://issues.apache.org/jira/browse/HDFS-5996
Project: Hadoop HDFS
Issue Type: Bug
Components: fuse-dfs
Affects Versions: 1.1.2
Environment: one master and three slaves, all of them normal
Reporter: wangmeng
Fix For: 1.1.2
Original Estimate: 504h
Remaining Estimate: 504h

I am a student from China; my research is on Hive data storage on Hadoop. There is an HDFS write bug when I use the SQL statement: insert overwrite table wangmeng select * from testTable. (This SQL is translated into N map jobs with no reduce, each map corresponding to one HDFS file output on disk.) No matter what value N is, there are always some DFSDataOutputStream buffers that cannot be written to disk. For example, with N=160 files there may be about 5 write-failed files, and a write-failed HDFS file is always 0 bytes on disk rather than some size between 0 and the correct size. No exception is thrown, and the HDFS bytes-written statistics are absolutely correct. When I debug, I find that the write-failed DFS buffers hold absolutely correct values, but the buffers are never written to disk even though I call DFSDataOutputStream.flush() and DFSDataOutputStream.close(). I cannot find the reason why those DFS buffers fail to be written. For now I avoid the problem with a temporary file: if a DFS buffer would be written to its final destination FINAL, I first write it to a temporary file TEM, and then move the data from TEM to the destination simply by changing the HDFS file path. This method avoids the DFS buffer write failure. Now I want to fix this problem properly, so how can I contribute a patch for it, and is there anything I can do? Many thanks.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
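For reference, a minimal write-and-flush sketch against the Hadoop 1.x API the reporter is using. It only illustrates the flush/close sequence the report mentions and does not reproduce the bug; the output path is an example:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/example.out"));
    out.writeBytes("example payload\n");
    out.sync();  // Hadoop 1.x durability call (hflush()/hsync() in 2.x)
    out.close();
  }
}
{code}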
[jira] [Updated] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated HDFS-5196:
-------------------------------------

Attachment: HDFS-5196-6.patch

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196-6.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6112) NFS Gateway docs are incorrect for allowed hosts configuration
[ https://issues.apache.org/jira/browse/HDFS-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940498#comment-13940498 ]

Hudson commented on HDFS-6112:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1706 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1706/])

HDFS-6112. NFS Gateway docs are incorrect for allowed hosts configuration. Contributed by Aaron T. Myers. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579097)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm

NFS Gateway docs are incorrect for allowed hosts configuration
--------------------------------------------------------------

Key: HDFS-6112
URL: https://issues.apache.org/jira/browse/HDFS-6112
Project: Hadoop HDFS
Issue Type: Bug
Components: nfs
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6112.patch

The NFS gateway export configuration docs say that the machine name configuration can be wildcards and provide the example {{host*.example.com}}. The term "wildcard" and this example might imply typical globbing semantics, but in fact what it actually supports is Java regular expressions. I think we should change the docs to make this clearer.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6117) Print file path information in FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940495#comment-13940495 ]

Hudson commented on HDFS-6117:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1706 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1706/])

HDFS-6117. Print file path information in FileNotFoundException on INode ID mismatch. Contributed by Suresh Srinivas. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579104)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeId.java

Print file path information in FileNotFoundException
----------------------------------------------------

Key: HDFS-6117
URL: https://issues.apache.org/jira/browse/HDFS-6117
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Priority: Minor
Fix For: 2.4.0
Attachments: HDFS-6117.patch

Print file path information in the FileNotFoundException thrown from INodeId#checkId(). This helps debug issues related to possible INode ID mismatches during namenode failover.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940499#comment-13940499 ]

Hudson commented on HDFS-6099:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1706 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1706/])

HDFS-6099. HDFS file system limits not enforced on renames. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579122)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java

HDFS file system limits not enforced on renames.
------------------------------------------------

Key: HDFS-6099
URL: https://issues.apache.org/jira/browse/HDFS-6099
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.3.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Fix For: 2.4.0
Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch, HDFS-6099.3.patch, HDFS-6099.4.patch, HDFS-6099.5.patch

{{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6068) Disallow snapshot names that are also invalid directory names
[ https://issues.apache.org/jira/browse/HDFS-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940493#comment-13940493 ]

Hudson commented on HDFS-6068:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1706 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1706/])

Add TestSnapshotNameWithInvalidCharacters.java for HDFS-6068. (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579005)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotNameWithInvalidCharacters.java

HDFS-6068. Disallow snapshot names that are also invalid directory names. Contributed by sathish (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579004)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java

Disallow snapshot names that are also invalid directory names
-------------------------------------------------------------

Key: HDFS-6068
URL: https://issues.apache.org/jira/browse/HDFS-6068
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: sathish
Fix For: 2.4.0
Attachments: HDFS-6068-0001.patch, HDFS-6068-0002.patch, HDFS-6068-0003.patch, HDFS-6068-0004.patch

There are a number of restrictions on valid names in HDFS. For example, you can't name a directory "." or "..", or something containing a ":". However, I can happily create a snapshot named "a:b:c", resulting in this:
{code}
% hdfs dfs -createSnapshot /home/andrew a:b:c
Created snapshot /home/andrew/.snapshot/a:b:c
% hadoop fs -ls /home/andrew/.snapshot
-ls: java.net.URISyntaxException: Relative path in absolute URI: a:b:c
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940494#comment-13940494 ]

Hudson commented on HDFS-6007:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1706 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1706/])

HDFS-6007. Update documentation about short-circuit local reads (iwasakims via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578994)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm

Update documentation about short-circuit local reads
----------------------------------------------------

Key: HDFS-6007
URL: https://issues.apache.org/jira/browse/HDFS-6007
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch, HDFS-6007-5.patch

Update the contents of the HDFS Short-Circuit Local Reads documentation based on the changes in HDFS-4538 and HDFS-4953.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940535#comment-13940535 ]

Hudson commented on HDFS-6007:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1731 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1731/])

HDFS-6007. Update documentation about short-circuit local reads (iwasakims via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578994)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ShortCircuitLocalReads.apt.vm

Update documentation about short-circuit local reads
----------------------------------------------------

Key: HDFS-6007
URL: https://issues.apache.org/jira/browse/HDFS-6007
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch, HDFS-6007-5.patch

Update the contents of the HDFS Short-Circuit Local Reads documentation based on the changes in HDFS-4538 and HDFS-4953.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6068) Disallow snapshot names that are also invalid directory names
[ https://issues.apache.org/jira/browse/HDFS-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940534#comment-13940534 ]

Hudson commented on HDFS-6068:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1731 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1731/])

Add TestSnapshotNameWithInvalidCharacters.java for HDFS-6068. (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579005)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotNameWithInvalidCharacters.java

HDFS-6068. Disallow snapshot names that are also invalid directory names. Contributed by sathish (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579004)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshot.java

Disallow snapshot names that are also invalid directory names
-------------------------------------------------------------

Key: HDFS-6068
URL: https://issues.apache.org/jira/browse/HDFS-6068
Project: Hadoop HDFS
Issue Type: Bug
Components: snapshots
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: sathish
Fix For: 2.4.0
Attachments: HDFS-6068-0001.patch, HDFS-6068-0002.patch, HDFS-6068-0003.patch, HDFS-6068-0004.patch

There are a number of restrictions on valid names in HDFS. For example, you can't name a directory "." or "..", or something containing a ":". However, I can happily create a snapshot named "a:b:c", resulting in this:
{code}
% hdfs dfs -createSnapshot /home/andrew a:b:c
Created snapshot /home/andrew/.snapshot/a:b:c
% hadoop fs -ls /home/andrew/.snapshot
-ls: java.net.URISyntaxException: Relative path in absolute URI: a:b:c
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6117) Print file path information in FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940536#comment-13940536 ]

Hudson commented on HDFS-6117:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1731 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1731/])

HDFS-6117. Print file path information in FileNotFoundException on INode ID mismatch. Contributed by Suresh Srinivas. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579104)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeId.java

Print file path information in FileNotFoundException
----------------------------------------------------

Key: HDFS-6117
URL: https://issues.apache.org/jira/browse/HDFS-6117
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Priority: Minor
Fix For: 2.4.0
Attachments: HDFS-6117.patch

Print file path information in the FileNotFoundException thrown from INodeId#checkId(). This helps debug issues related to possible INode ID mismatches during namenode failover.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6112) NFS Gateway docs are incorrect for allowed hosts configuration
[ https://issues.apache.org/jira/browse/HDFS-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940539#comment-13940539 ]

Hudson commented on HDFS-6112:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1731 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1731/])

HDFS-6112. NFS Gateway docs are incorrect for allowed hosts configuration. Contributed by Aaron T. Myers. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579097)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm

NFS Gateway docs are incorrect for allowed hosts configuration
--------------------------------------------------------------

Key: HDFS-6112
URL: https://issues.apache.org/jira/browse/HDFS-6112
Project: Hadoop HDFS
Issue Type: Bug
Components: nfs
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
Fix For: 2.5.0
Attachments: HDFS-6112.patch

The NFS gateway export configuration docs say that the machine name configuration can be wildcards and provide the example {{host*.example.com}}. The term "wildcard" and this example might imply typical globbing semantics, but in fact what it actually supports is Java regular expressions. I think we should change the docs to make this clearer.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940540#comment-13940540 ]

Hudson commented on HDFS-6099:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1731 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1731/])

HDFS-6099. HDFS file system limits not enforced on renames. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579122)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java

HDFS file system limits not enforced on renames.
------------------------------------------------

Key: HDFS-6099
URL: https://issues.apache.org/jira/browse/HDFS-6099
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.3.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Fix For: 2.4.0
Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch, HDFS-6099.3.patch, HDFS-6099.4.patch, HDFS-6099.5.patch

{{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940568#comment-13940568 ]

Hadoop QA commented on HDFS-5196:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12635537/HDFS-5196-6.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6441//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6441//console

This message is automatically generated.

Provide more snapshot information in WebUI
------------------------------------------

Key: HDFS-5196
URL: https://issues.apache.org/jira/browse/HDFS-5196
Project: Hadoop HDFS
Issue Type: Improvement
Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196-6.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png

The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940727#comment-13940727 ]

Hudson commented on HDFS-6100:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5358 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5358/])

HDFS-6100. DataNodeWebHdfsMethods does not failover in HA mode. Contributed by Haohui Mai. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579301)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/resources/DatanodeWebHdfsMethods.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/InetSocketAddressParam.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/NamenodeAddressParam.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/NamenodeRpcAddressParam.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSForHA.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Fix For: 2.4.0
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6100:
----------------------------

Resolution: Fixed
Fix Version/s: 2.4.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~wheat9] for the contribution.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Fix For: 2.4.0
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940716#comment-13940716 ]

Jing Zhao commented on HDFS-6100:
---------------------------------

+1 for the latest patch.

DataNodeWebHdfsMethods does not failover in HA mode
---------------------------------------------------

Key: HDFS-6100
URL: https://issues.apache.org/jira/browse/HDFS-6100
Project: Hadoop HDFS
Issue Type: Bug
Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-6100.000.patch, HDFS-6100.001.patch

In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. This implementation has two problems:
# The {{DFSClient}} only knows about the current active NN, thus it does not support failover.
# The delegation token is based on the active NN, therefore the {{DFSClient}} will fail to authenticate with the standby NN in a secure HA setup.
Currently the parameter {{namenoderpcaddress}} in the URL stores the host-ip pair that corresponds to the active NN. To fix this bug, this jira proposes to store the name service ID in the parameter in HA setups (the parameter stays the same in non-HA setups).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940762#comment-13940762 ]

Haohui Mai commented on HDFS-6105:
----------------------------------

Thanks [~kihwal] and [~tthompso] for reporting this issue. The page loads the jmx data multiple times because the corresponding function {{load_datanode_info()}} is invoked multiple times:
# {{load_datanode_info()}} is registered with the link multiple times.
# The {{hashchange}} handler also calls {{load_datanode_info()}} when it detects a change of the hash tag.

NN web UI for DN list loads the same jmx page multiple times.
-------------------------------------------------------------

Key: HDFS-6105
URL: https://issues.apache.org/jira/browse/HDFS-6105
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Kihwal Lee
Attachments: datanodes-tab.png

When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Assigned] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai reassigned HDFS-6105: Assignee: Haohui Mai NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6105: - Attachment: HDFS-6105.000.patch NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6105: - Status: Patch Available (was: Open) NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6123) Improve datanode error messages
[ https://issues.apache.org/jira/browse/HDFS-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940763#comment-13940763 ] Brandon Li commented on HDFS-6123: -- [~szetszwo], in the following change, the debug information will be logged in both cases, and the only difference is the severity level. Is my understanding correct: the exception is normal, so we want to give it the lowest severity possible while making sure it still appears in the log?
{noformat}
- LOG.info("exception: ", e);
+ if (LOG.isTraceEnabled()) {
+   LOG.trace("Failed to send data:", e);
+ } else {
+   LOG.info("Failed to send data: " + e);
+ }
{noformat}
Improve datanode error messages --- Key: HDFS-6123 URL: https://issues.apache.org/jira/browse/HDFS-6123 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: 6123_20140318.patch [~yeshavora] found two cases where unnecessary exception stack traces appear in the datanode log: - SocketTimeoutException {noformat} 2014-03-07 03:30:44,567 INFO datanode.DataNode (BlockSender.java:sendPacket(563)) - exception: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.xx.xx:1019 remote=/xx.xx.xx.xx:37997] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) ... {noformat} - ReplicaAlreadyExistsException {noformat} 2014-03-07 03:02:39,334 ERROR datanode.DataNode (DataXceiver.java:run(234)) - xx.xx.xx.xx:1019:DataXceiver error processing WRITE_BLOCK operation src: /xx.xx.xx.xx:32959 dest: /xx.xx.xx.xx:1019 org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1409640778-xx.xx.xx.xx-1394150965191:blk_1073742158_1334 already exists in state TEMPORARY and thus cannot be created. at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:874) ... {noformat} Both cases are normal. They are not bugs. -- This message was sent by Atlassian JIRA (v6.2#6252)
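For readers following along, the pattern in the diff above amounts to the following self-contained sketch (class name and message text are illustrative): log the full stack trace only when trace-level logging is on, and otherwise emit a one-line INFO message, since string-concatenating the exception yields only its class and message.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class SendFailureLogging {
  private static final Log LOG = LogFactory.getLog(SendFailureLogging.class);

  static void logSendFailure(Exception e) {
    if (LOG.isTraceEnabled()) {
      // Passing the Throwable as the second argument logs the full stack.
      LOG.trace("Failed to send data:", e);
    } else {
      // Concatenation calls e.toString(): class name + message, no stack.
      LOG.info("Failed to send data: " + e);
    }
  }
}
{code}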
[jira] [Created] (HDFS-6126) TestNameNodeMetrics#testCorruptBlock fails intermittently
Mit Desai created HDFS-6126: --- Summary: TestNameNodeMetrics#testCorruptBlock fails intermittently Key: HDFS-6126 URL: https://issues.apache.org/jira/browse/HDFS-6126 Project: Hadoop HDFS Issue Type: Test Affects Versions: 2.4.0 Reporter: Mit Desai I get the following error {noformat} testCorruptBlock(org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics) Time elapsed: 5.556 sec FAILURE! java.lang.AssertionError: Bad value for metric CorruptBlocks expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.hadoop.test.MetricsAsserts.assertGauge(MetricsAsserts.java:190) at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.testCorruptBlock(TestNameNodeMetrics.java:247) Results : Failed tests: TestNameNodeMetrics.testCorruptBlock:247 Bad value for metric CorruptBlocks expected:<1> but was:<0> {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5957) Provide support for different mmap cache retention policies in ShortCircuitCache.
[ https://issues.apache.org/jira/browse/HDFS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5957. - Resolution: Done I'm going to go ahead and resolve this. I think we have everything we need. [~gopalv], I can work with you offline if there are still any tuning requirements. Thanks for the discussion, everyone. I think this helped clarify the implementation for everyone, and we also have a useful follow-up item in YARN-1747 to improve container memory monitoring. Provide support for different mmap cache retention policies in ShortCircuitCache. - Key: HDFS-5957 URL: https://issues.apache.org/jira/browse/HDFS-5957 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.3.0 Reporter: Chris Nauroth Currently, the {{ShortCircuitCache}} retains {{mmap}} regions for reuse by multiple reads of the same block or by multiple threads. The eventual {{munmap}} executes on a background thread after an expiration period. Some client usage patterns would prefer strict bounds on this cache and deterministic cleanup by calling {{munmap}}. This issue proposes additional support for different caching policies that better fit these usage patterns. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940834#comment-13940834 ] Andrew Wang commented on HDFS-6089: --- Hmm, let me try to explain one more time. My concern wasn't about staleness here, but spamming the edits dirs with a lot of small files. I agree that having the ANN roll on a time basis is fine when SbNN and ANN are both up and checkpointing is happening. An issue arises if checkpoints aren't happening, either because the SbNN is down, or checkpoints are otherwise broken (e.g. the edit log op too big bug, transfer timeouts with a large fsimage, or some of the fallout from the PB-fication of the fsimage). In this scenario, the ANN will keep rolling every 2mins, generating a lot of edit log segments that aren't being cleared out. I've seen oblivious customers run for a month while checkpointing was nonfunctional, and 2 min rolling would lead to an awful lot of files. This is one reason why the ANN autoroller rolls on a size basis rather than time. It should generate fewer, larger segments, which is more manageable. This isn't a super major issue, but I thought I'd bring it up as a reason why we might prefer a different solution. Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended Key: HDFS-6089 URL: https://issues.apache.org/jira/browse/HDFS-6089 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jing Zhao Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch The following scenario was tested: * Determine Active NN and suspend the process (kill -19) * Wait about 60s to let the standby transition to active * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active. What was noticed was that sometimes the call to get the service state of nn2 got a socket timeout exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
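A hedged sketch of the size-based rolling Andrew describes, to make the contrast with time-based rolling concrete. All names here ({{rollThresholdTxns}}, {{maybeRoll}}) are illustrative assumptions, not the NN's actual autoroller code:

{code}
class SizeBasedAutoRoller {
  private final long rollThresholdTxns;
  private long lastRolledTxId;

  SizeBasedAutoRoller(long rollThresholdTxns, long currentTxId) {
    this.rollThresholdTxns = rollThresholdTxns;
    this.lastRolledTxId = currentTxId;
  }

  /**
   * Roll only after enough transactions have accumulated. Even if
   * checkpointing is broken for a month, the number of segments is
   * bounded by the edits actually written, not by elapsed time, so the
   * edits dirs are not spammed with small files.
   */
  boolean maybeRoll(long lastWrittenTxId) {
    if (lastWrittenTxId - lastRolledTxId >= rollThresholdTxns) {
      lastRolledTxId = lastWrittenTxId;
      return true; // caller triggers the actual edit log roll
    }
    return false;
  }
}
{code}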
[jira] [Created] (HDFS-6127) sLive with webhdfs fails on secure HA cluster with does not contain valid host port authority error
Arpit Gupta created HDFS-6127: - Summary: sLive with webhdfs fails on secure HA cluster with does not contain valid host port authority error Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6127) sLive with webhdfs fails on secure HA cluster with does not contain valid host port authority error
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940837#comment-13940837 ] Arpit Gupta commented on HDFS-6127: --- Here is the stack trace {code} /usr/lib/hadoop/bin/hadoop org.apache.hadoop.fs.slive.SliveTest -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure/user/user/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 12 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 6 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform INFO|Initial wait for Service namenode2: 60 14/03/18 07:24:44 INFO slive.SliveTest: Running with option list -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure/user/user/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 12 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 6 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform 14/03/18 07:24:44 INFO slive.SliveTest: Options are: 14/03/18 07:24:44 INFO slive.ConfigExtractor: Base directory = webhdfs://ha-2-secure/user/user/ha-slive/slive 14/03/18 07:24:44 INFO slive.ConfigExtractor: Data directory = webhdfs://ha-2-secure/user/user/ha-slive/slive/data 14/03/18 07:24:44 INFO slive.ConfigExtractor: Output directory = webhdfs://ha-2-secure/user/user/ha-slive/slive/output 14/03/18 07:24:44 INFO slive.ConfigExtractor: Result file = /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out 14/03/18 07:24:44 INFO slive.ConfigExtractor: Grid queue = default 14/03/18 07:24:44 INFO slive.ConfigExtractor: Should exit on first error = false 14/03/18 07:24:44 INFO slive.ConfigExtractor: Duration = 60 milliseconds 14/03/18 07:24:44 INFO slive.ConfigExtractor: Map amount = 12 14/03/18 07:24:44 INFO slive.ConfigExtractor: Reducer amount = 6 14/03/18 07:24:44 INFO slive.ConfigExtractor: Operation amount = 1 14/03/18 07:24:44 INFO slive.ConfigExtractor: Total file limit = 1024 14/03/18 07:24:44 INFO slive.ConfigExtractor: Total dir file limit = 16 14/03/18 07:24:44 INFO slive.ConfigExtractor: Read size = 1,4294967295 bytes 14/03/18 07:24:44 INFO slive.ConfigExtractor: Write size = 1,134217728 bytes 14/03/18 07:24:44 INFO slive.ConfigExtractor: Append size = 1,134217728 bytes 14/03/18 07:24:44 INFO slive.ConfigExtractor: Block size = 16777216,33554432 bytes 14/03/18 07:24:44 INFO slive.ConfigExtractor: Random seed = 12345678 14/03/18 07:24:44 INFO slive.ConfigExtractor: Sleep range = 100,1000 milliseconds 14/03/18 07:24:44 INFO slive.ConfigExtractor: Replication amount = 1,3 14/03/18 07:24:44 INFO slive.ConfigExtractor: Operations are: 14/03/18 07:24:44 INFO slive.ConfigExtractor: READ 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: APPEND 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: MKDIR 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: 
LS 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: DELETE 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: RENAME 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 14% 14/03/18 07:24:44 INFO slive.ConfigExtractor: CREATE 14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM 14/03/18 07:24:44 INFO slive.ConfigExtractor: 16% 14/03/18 07:24:44 INFO slive.SliveTest: Running job: 14/03/18 07:24:44 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses 14/03/18 07:24:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses 14/03/18 07:24:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses 14/03/18 07:24:48 WARN token.Token: Cannot find class for token kind WEBHDFS delegation 14/03/18 07:24:48 INFO security.TokenCache: Got dt for webhdfs://ha-2-secure; Kind: WEBHDFS delegation, Service: ha-hdfs:ha-2-secure, Ident: 00 06 68 72 74 5f 71 61
[jira] [Updated] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-4564: -- Status: Patch Available (was: Open) Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-4564: -- Attachment: HDFS-4564.patch Updated patch to properly deal with authentication exceptions. Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6127: - Summary: WebHDFS tokens cannot be renewed in HA setup (was: sLive with webhdfs fails on secure HA cluster with does not contain valid host port authority error) WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6127: - Description: The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in HA setup. WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
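To illustrate the broken assumption, a minimal sketch (not the actual {{TokenAspect}} code) of why renewal fails: the renewer parses the token's service field as host:port, which cannot work when the service is a logical name such as {{ha-hdfs:ha-2-secure}} in the slive log earlier.

{code}
import java.net.InetSocketAddress;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.net.NetUtils;

public class TokenServiceCheck {
  static InetSocketAddress toAddress(Text service) {
    String s = service.toString();
    if (s.startsWith("ha-hdfs:")) {
      // HA: the service is a name service id, not an address. A correct
      // renewer must route through the configured failover proxy instead
      // of parsing host:port out of this string.
      throw new IllegalArgumentException(
          s + " does not contain a valid host:port authority");
    }
    return NetUtils.createSocketAddr(s); // non-HA: "host:port"
  }
}
{code}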
[jira] [Updated] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6127: - Attachment: HDFS-6127.000.patch WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6127: - Status: Patch Available (was: Open) WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940949#comment-13940949 ] Hadoop QA commented on HDFS-6105: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635598/HDFS-6105.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6442//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6442//console This message is automatically generated. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940956#comment-13940956 ] Haohui Mai commented on HDFS-6105: -- [~kihwal] and [~tthompso], can you please verify whether the patch works in your environment? Thanks. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6123) Improve datanode error messages
[ https://issues.apache.org/jira/browse/HDFS-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940958#comment-13940958 ] Brandon Li commented on HDFS-6123: -- After the offline discussion with Nicholas, I understand the difference here: basically we don't want to log the whole stack trace unless trace-level logging is enabled. +1. Improve datanode error messages --- Key: HDFS-6123 URL: https://issues.apache.org/jira/browse/HDFS-6123 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: 6123_20140318.patch [~yeshavora] found two cases where unnecessary exception stack traces appear in the datanode log: - SocketTimeoutException {noformat} 2014-03-07 03:30:44,567 INFO datanode.DataNode (BlockSender.java:sendPacket(563)) - exception: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.xx.xx:1019 remote=/xx.xx.xx.xx:37997] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) ... {noformat} - ReplicaAlreadyExistsException {noformat} 2014-03-07 03:02:39,334 ERROR datanode.DataNode (DataXceiver.java:run(234)) - xx.xx.xx.xx:1019:DataXceiver error processing WRITE_BLOCK operation src: /xx.xx.xx.xx:32959 dest: /xx.xx.xx.xx:1019 org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1409640778-xx.xx.xx.xx-1394150965191:blk_1073742158_1334 already exists in state TEMPORARY and thus cannot be created. at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:874) ... {noformat} Both cases are normal. They are not bugs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940982#comment-13940982 ] Travis Thompson commented on HDFS-6105: --- I've confirmed the patch is working on Hadoop 2.3.0. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6123) Improve datanode error messages
[ https://issues.apache.org/jira/browse/HDFS-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6123: -- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Brandon for reviewing the patch. I have committed this. Improve datanode error messages --- Key: HDFS-6123 URL: https://issues.apache.org/jira/browse/HDFS-6123 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.4.0 Attachments: 6123_20140318.patch [~yeshavora] found two cases where unnecessary exception stack traces appear in the datanode log: - SocketTimeoutException {noformat} 2014-03-07 03:30:44,567 INFO datanode.DataNode (BlockSender.java:sendPacket(563)) - exception: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.xx.xx:1019 remote=/xx.xx.xx.xx:37997] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) ... {noformat} - ReplicaAlreadyExistsException {noformat} 2014-03-07 03:02:39,334 ERROR datanode.DataNode (DataXceiver.java:run(234)) - xx.xx.xx.xx:1019:DataXceiver error processing WRITE_BLOCK operation src: /xx.xx.xx.xx:32959 dest: /xx.xx.xx.xx:1019 org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1409640778-xx.xx.xx.xx-1394150965191:blk_1073742158_1334 already exists in state TEMPORARY and thus cannot be created. at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:874) ... {noformat} Both cases are normal. They are not bugs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6123) Improve datanode error messages
[ https://issues.apache.org/jira/browse/HDFS-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941031#comment-13941031 ] Hudson commented on HDFS-6123: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5360 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5360/]) HDFS-6123. Do not log stack trace for ReplicaAlreadyExistsException and SocketTimeoutException. (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1579396) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java Improve datanode error messages --- Key: HDFS-6123 URL: https://issues.apache.org/jira/browse/HDFS-6123 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.4.0 Attachments: 6123_20140318.patch [~yeshavora] found two cases where unnecessary exception stack traces appear in the datanode log: - SocketTimeoutException {noformat} 2014-03-07 03:30:44,567 INFO datanode.DataNode (BlockSender.java:sendPacket(563)) - exception: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xx.xx.xx.xx:1019 remote=/xx.xx.xx.xx:37997] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) ... {noformat} - ReplicaAlreadyExistsException {noformat} 2014-03-07 03:02:39,334 ERROR datanode.DataNode (DataXceiver.java:run(234)) - xx.xx.xx.xx:1019:DataXceiver error processing WRITE_BLOCK operation src: /xx.xx.xx.xx:32959 dest: /xx.xx.xx.xx:1019 org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1409640778-xx.xx.xx.xx-1394150965191:blk_1073742158_1334 already exists in state TEMPORARY and thus cannot be created. at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:874) ... {noformat} Both cases are normal. They are not bugs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6110) adding more slow action log in critical write path
[ https://issues.apache.org/jira/browse/HDFS-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941064#comment-13941064 ] stack commented on HDFS-6110: - Patch LGTM [~xieliang007] Let me try it here. 300ms is eons. Probably good as a default. adding more slow action log in critical write path -- Key: HDFS-6110 URL: https://issues.apache.org/jira/browse/HDFS-6110 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0, 2.3.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-6110-v2.txt, HDFS-6110.txt After digging into an HBase write spike caused by slow buffer I/O in our cluster, we realized we should add more abnormal-latency warning logs to the write flow, so that if anyone else hits an HLog sync spike, they can get more detail from the HDFS side at the same time. Patch will be uploaded soon. -- This message was sent by Atlassian JIRA (v6.2#6252)
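As a rough illustration of the kind of slow-action warning being added (the wrapper, message text, and threshold handling are assumptions for this sketch, with the 300ms default from the discussion above):

{code}
import java.util.concurrent.Callable;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class SlowActionLog {
  private static final Log LOG = LogFactory.getLog(SlowActionLog.class);
  static final long SLOW_THRESHOLD_MS = 300; // default discussed above

  // Wraps a write-path action and warns if it exceeds the threshold.
  static <T> T timed(String action, Callable<T> body) throws Exception {
    long start = System.currentTimeMillis();
    try {
      return body.call();
    } finally {
      long elapsed = System.currentTimeMillis() - start;
      if (elapsed > SLOW_THRESHOLD_MS) {
        LOG.warn("Slow " + action + " took " + elapsed
            + " ms (threshold=" + SLOW_THRESHOLD_MS + " ms)");
      }
    }
  }
}
{code}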
[jira] [Created] (HDFS-6128) Implement libhdfs bindings for HDFS ACL APIs.
Chris Nauroth created HDFS-6128: --- Summary: Implement libhdfs bindings for HDFS ACL APIs. Key: HDFS-6128 URL: https://issues.apache.org/jira/browse/HDFS-6128 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Reporter: Chris Nauroth -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5606) libHDFS: implement hdfsRemoveDefaultAcl.
[ https://issues.apache.org/jira/browse/HDFS-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5606. - Resolution: Duplicate libHDFS: implement hdfsRemoveDefaultAcl. Key: HDFS-5606 URL: https://issues.apache.org/jira/browse/HDFS-5606 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{hdfsRemoveDefaultAcl}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5605) libHDFS: implement hdfsModifyAclEntries, hdfsRemoveAclEntries and hdfsRemoveAcl.
[ https://issues.apache.org/jira/browse/HDFS-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5605. - Resolution: Duplicate libHDFS: implement hdfsModifyAclEntries, hdfsRemoveAclEntries and hdfsRemoveAcl. Key: HDFS-5605 URL: https://issues.apache.org/jira/browse/HDFS-5605 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{hdfsModifyAclEntries}} and {{hdfsRemoveAclEntries}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941083#comment-13941083 ] Hadoop QA commented on HDFS-4564: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635621/HDFS-4564.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs: org.apache.hadoop.fs.http.server.TestHttpFSServer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6443//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6443//console This message is automatically generated. Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5604) libHDFS: implement hdfsGetAcls and hdfsSetAcl.
[ https://issues.apache.org/jira/browse/HDFS-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5604. - Resolution: Duplicate libHDFS: implement hdfsGetAcls and hdfsSetAcl. -- Key: HDFS-5604 URL: https://issues.apache.org/jira/browse/HDFS-5604 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{hdfsGetAcls}} and {{hdfsSetAcl}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5621) NameNode: add indicator in web UI file system browser if a file has an ACL.
[ https://issues.apache.org/jira/browse/HDFS-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5621: Issue Type: Improvement (was: Sub-task) Parent: (was: HDFS-4685) NameNode: add indicator in web UI file system browser if a file has an ACL. --- Key: HDFS-5621 URL: https://issues.apache.org/jira/browse/HDFS-5621 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Assignee: Haohui Mai Attachments: HDFS-5621.000.patch Change the file system browser to append the '+' character to permissions of any file or directory that has an ACL. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5624) Add tests for ACLs in combination with viewfs.
[ https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5624: Issue Type: Test (was: Sub-task) Parent: (was: HDFS-4685) Add tests for ACLs in combination with viewfs. -- Key: HDFS-5624 URL: https://issues.apache.org/jira/browse/HDFS-5624 Project: Hadoop HDFS Issue Type: Test Components: hdfs-client Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Add tests verifying that in a federated deployment, a viewfs wrapped over multiple federated NameNodes will dispatch the ACL operations to the correct NameNode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941088#comment-13941088 ] Jing Zhao commented on HDFS-6105: - +1 NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5621) NameNode: add indicator in web UI file system browser if a file has an ACL.
[ https://issues.apache.org/jira/browse/HDFS-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5621: Target Version/s: 2.5.0 (was: HDFS ACLs (HDFS-4685)) Affects Version/s: (was: HDFS ACLs (HDFS-4685)) 2.4.0 NameNode: add indicator in web UI file system browser if a file has an ACL. --- Key: HDFS-5621 URL: https://issues.apache.org/jira/browse/HDFS-5621 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.0 Reporter: Chris Nauroth Assignee: Haohui Mai Attachments: HDFS-5621.000.patch Change the file system browser to append the '+' character to permissions of any file or directory that has an ACL. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5995) TestFSEditLogLoader#testValidateEditLogWithCorruptBody gets OutOfMemoryError and dumps heap.
[ https://issues.apache.org/jira/browse/HDFS-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5995: Target Version/s: 2.5.0 (was: 3.0.0) Affects Version/s: (was: 3.0.0) 2.4.0 TestFSEditLogLoader#testValidateEditLogWithCorruptBody gets OutOfMemoryError and dumps heap. Key: HDFS-5995 URL: https://issues.apache.org/jira/browse/HDFS-5995 Project: Hadoop HDFS Issue Type: Test Components: namenode, test Affects Versions: 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-5995.1.patch {{TestFSEditLogLoader#testValidateEditLogWithCorruptBody}} is experiencing {{OutOfMemoryError}} and dumping heap since the merge of HDFS-4685. This doesn't actually cause the test to fail, because it's a failure test that corrupts an edit log intentionally. Still, this might cause confusion if someone reviews the build logs and thinks this is a more serious problem. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5624) Add tests for ACLs in combination with viewfs.
[ https://issues.apache.org/jira/browse/HDFS-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5624: Target Version/s: 2.5.0 (was: HDFS ACLs (HDFS-4685)) Affects Version/s: (was: HDFS ACLs (HDFS-4685)) 2.4.0 Add tests for ACLs in combination with viewfs. -- Key: HDFS-5624 URL: https://issues.apache.org/jira/browse/HDFS-5624 Project: Hadoop HDFS Issue Type: Test Components: hdfs-client Affects Versions: 2.4.0 Reporter: Chris Nauroth Add tests verifying that in a federated deployment, a viewfs wrapped over multiple federated NameNodes will dispatch the ACL operations to the correct NameNode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6128) Implement libhdfs bindings for HDFS ACL APIs.
[ https://issues.apache.org/jira/browse/HDFS-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6128: Affects Version/s: 2.4.0 Implement libhdfs bindings for HDFS ACL APIs. - Key: HDFS-6128 URL: https://issues.apache.org/jira/browse/HDFS-6128 Project: Hadoop HDFS Issue Type: Improvement Components: libhdfs Affects Versions: 2.4.0 Reporter: Chris Nauroth -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4685) Implementation of ACLs in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4685: Fix Version/s: (was: 3.0.0) Implementation of ACLs in HDFS -- Key: HDFS-4685 URL: https://issues.apache.org/jira/browse/HDFS-4685 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, security Affects Versions: 1.1.2 Reporter: Sachin Jose Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-4685-branch-2.1.patch, HDFS-4685.1.patch, HDFS-4685.2.patch, HDFS-4685.3.patch, HDFS-4685.4.patch, HDFS-ACLs-Design-1.pdf, HDFS-ACLs-Design-2.pdf, HDFS-ACLs-Design-3.pdf, Test-Plan-for-Extended-Acls-1.pdf, Test-Plan-for-Extended-Acls-2.pdf Currently hdfs doesn't support extended file ACLs. In Unix, extended ACLs can be managed using the getfacl and setfacl utilities. Is there anybody working on this feature? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-5117) Allow the owner of an HDFS path to be a group
[ https://issues.apache.org/jira/browse/HDFS-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved HDFS-5117. - Resolution: Duplicate I'm resolving this as a duplicate of HDFS-4685. With the release of 2.4.0, this requirement can be implemented using HDFS ACLs to specify multiple named groups and named users with specific permissions. Allow the owner of an HDFS path to be a group - Key: HDFS-5117 URL: https://issues.apache.org/jira/browse/HDFS-5117 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Reporter: Ryan Hennig Original Estimate: 48h Remaining Estimate: 48h At eBay, we have the need to associate some HDFS paths with a set of users with write access, a set of users with read-only access, and neither read nor write access for others. The current model of POSIX-style permissions is nearly sufficient for this, except for the need for multiple writers. One easy fix would be to allow the owner of a path to be a group, and then grant owner permissions to all members of that group. I have verified that HDP 1.3 allows you to set the owner of a path to a group without error, but the owner permissions of that group are not given to members of the group. I've created a relatively simple fix for this by modifying the check method in src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java and I am working on related changes to unit tests etc now. - Ryan -- This message was sent by Atlassian JIRA (v6.2#6252)
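For readers hitting the same requirement, a short example of the ACL-based resolution against the released 2.4.0 FileSystem API (the path and group names are made up for illustration): grant one named group full access and another read-only access, leaving everyone else to the base permission bits.

{code}
import static org.apache.hadoop.fs.permission.AclEntryScope.ACCESS;
import static org.apache.hadoop.fs.permission.AclEntryType.GROUP;

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.FsAction;

public class MultiGroupAccess {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    fs.modifyAclEntries(new Path("/data/shared"), Arrays.asList(
        // "writers" get rwx on the directory...
        new AclEntry.Builder().setScope(ACCESS).setType(GROUP)
            .setName("writers").setPermission(FsAction.ALL).build(),
        // ...while "readers" get read-only (r-x allows directory traversal).
        new AclEntry.Builder().setScope(ACCESS).setType(GROUP)
            .setName("readers").setPermission(FsAction.READ_EXECUTE).build()));
  }
}
{code}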
[jira] [Updated] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6093: -- Attachment: hdfs-6093-2.patch Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-39) Namenode blockMap not updated when datanode invalidates a block on heart beat
[ https://issues.apache.org/jira/browse/HDFS-39?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-39. - Resolution: Not A Problem Anyway, this issue is Not A Problem anymore. Resolving ... Namenode blockMap not updated when datanode invalidates a block on heart beat - Key: HDFS-39 URL: https://issues.apache.org/jira/browse/HDFS-39 Project: Hadoop HDFS Issue Type: Bug Reporter: Wei Xu Here are the related logs. dn_10_251_107_213 deleted the block immediately after it received it (because of over-replication). However, the information is not reported to the namenode, and the namenode asks this DN to invalidate the block again, causing the error "Unexpected error trying to delete block blk_-1201258788540100829. BlockInfo not found in volumeMap."
dn_10_251_107_213 | 080722 145554 2497 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1201258788540100829 src: /10.251.67.214:54770 dest: /10.251.67.214:50010
dn_10_251_193_205 | 080722 145554 2468 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1201258788540100829 src: /10.251.107.213:36718 dest: /10.251.107.213:50010
dn_10_251_67_214 | 080722 145554 2306 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1201258788540100829 src: /10.251.67.214:60599 dest: /10.251.67.214:50010
nn_10_251_210_161 | 080722 145554 30 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand/_temporary/_task_200807221347_0001_m_92_1/part-00092. blk_-1201258788540100829
dn_10_251_107_213 | 080722 145656 2498 INFO dfs.DataNode$PacketResponder: Received block blk_-1201258788540100829 of size 29767698 from /10.251.67.214
dn_10_251_193_205 | 080722 145656 2469 INFO dfs.DataNode$PacketResponder: Received block blk_-1201258788540100829 of size 29767698 from /10.251.107.213
dn_10_251_67_214 | 080722 145656 2307 INFO dfs.DataNode$PacketResponder: Received block blk_-1201258788540100829 of size 29767698 from /10.251.67.214
dn_10_251_107_213 | 080722 145656 2498 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_-1201258788540100829 terminating
dn_10_251_193_205 | 080722 145656 2469 INFO dfs.DataNode$PacketResponder: PacketResponder 0 for block blk_-1201258788540100829 terminating
dn_10_251_67_214 | 080722 145656 2307 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_-1201258788540100829 terminating
nn_10_251_210_161 | 080722 145656 19 INFO dfs.FSNamesystem: BLOCK* ask 10.251.67.214:50010 to replicate blk_-1201258788540100829 to datanode(s) 10.251.107.49:50010
nn_10_251_210_161 | 080722 145656 31 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.67.214:50010 is added to blk_-1201258788540100829 size 29767698
nn_10_251_210_161 | 080722 145656 31 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.107.213:50010 is added to blk_-1201258788540100829 size 29767698
nn_10_251_210_161 | 080722 145656 26 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.193.205:50010 is added to blk_-1201258788540100829 size 29767698
dn_10_251_107_49 | 080722 145658 2484 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1201258788540100829 src: /10.251.67.214:42237 dest: /10.251.67.214:50010
dn_10_251_67_214 | 080722 145658 18 INFO dfs.DataNode: 10.251.67.214:50010 Starting thread to transfer block blk_-1201258788540100829 to 10.251.107.49:50010
dn_10_251_107_49 | 080722 145705 2484 INFO dfs.DataNode$DataXceiver: Received block blk_-1201258788540100829 src: /10.251.67.214:42237 dest: /10.251.67.214:50010 of size 29767698
dn_10_251_67_214 | 080722 145705 2342 INFO dfs.DataNode$DataTransfer: 10.251.67.214:50010:Transmitted block blk_-1201258788540100829 to /10.251.107.49:50010
nn_10_251_210_161 | 080722 145705 34 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.107.49:50010 is added to blk_-1201258788540100829 size 29767698
dn_10_251_107_213 | 080722 145710 19 INFO dfs.FSDataset: Deleting block blk_-1201258788540100829 file /mnt/hadoop/dfs/data/current/subdir11/blk_-1201258788540100829
nn_10_251_210_161 | 080722 150220 34 INFO dfs.FSNamesystem: BLOCK* NameSystem.delete: blk_-1201258788540100829 is added to invalidSet of 10.251.67.214:50010
nn_10_251_210_161 | 080722 150220 34 INFO dfs.FSNamesystem: BLOCK* NameSystem.delete: blk_-1201258788540100829 is added to invalidSet of 10.251.107.213:50010
nn_10_251_210_161 | 080722 150220 34 INFO dfs.FSNamesystem: BLOCK* NameSystem.delete: blk_-1201258788540100829 is added to invalidSet of 10.251.193.205:50010
nn_10_251_210_161 | 080722 150220 34 INFO dfs.FSNamesystem: BLOCK* NameSystem.delete: blk_-1201258788540100829 is added to invalidSet of
[jira] [Commented] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941109#comment-13941109 ] Hadoop QA commented on HDFS-6127: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635628/HDFS-6127.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6444//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6444//console This message is automatically generated. WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13941108#comment-13941108 ] Andrew Wang commented on HDFS-6093: --- Hi Arpit and Colin, thanks for reviewing. New patch is up. Addressed your feedback except the following, and I also fixed a logging issue I found: bq. update CentralizedCacheManagement.html in the docs? Added a short blurb. A nice follow-on JIRA would be an FAQ for debugging caching, since it can be tricky right now. bq. display the pending caching/uncaching counts in the output of 'dfsadmin -report'? I think dfsadmin -report is more about usage statistics than replication work. Having the pending stats as a metric and on the webUI means it should still be easy enough to access. bq. Was stillPendingUncached introduced to fix a bug? This is required because cache reports just tell you what's cached, not also what was uncached. So, we need to compute a diff to update pendingUncached correctly. bq. ternary statement code nit I prefer not to use ternary statements, so I'd like to leave it as is if that's okay. bq. decouple the counter(s) that can be read from the CRM from the counters that the CRM uses internally With the locking issues resolved, is it okay to just leave it with a single set of variables? I could switch it over to AtomicLongs or something, but I think it's all under the FSN lock anyway. bq. colin Incrementally updating the pendingUncached list and stats is a nice idea, but it seems too ambitious for 2.4 at this point. I'm okay bumping this to 2.5 if you'd rather not put this in 2.4, but I think this all works now with the locking fixed. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
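On the {{stillPendingUncached}} point, a tiny self-contained sketch of the diff computation being described (names are illustrative, not the actual NN code): a cache report lists only what is still cached, so a pending-uncache entry is confirmed done by its absence from the report.

{code}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PendingUncacheTracker {
  private final Set<Long> pendingUncached = new HashSet<Long>();

  void requestUncache(long blockId) {
    pendingUncached.add(blockId);
  }

  // A DN cache report only says what IS cached; anything we asked to
  // uncache that no longer appears in the report has completed, so drop it.
  void processCacheReport(List<Long> cachedBlockIds) {
    Set<Long> stillPending = new HashSet<Long>(cachedBlockIds);
    stillPending.retainAll(pendingUncached);
    pendingUncached.clear();
    pendingUncached.addAll(stillPending);
  }

  int getPendingUncachedCount() {
    return pendingUncached.size();
  }
}
{code}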
[jira] [Resolved] (HDFS-2076) ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(1
[ https://issues.apache.org/jira/browse/HDFS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-2076. --- Resolution: Cannot Reproduce Resolving as Cannot Reproduce. Please feel free to reopen it if it is not the case. ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(1 - Key: HDFS-2076 URL: https://issues.apache.org/jira/browse/HDFS-2076 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 0.20.2 Environment: hadoop -hdfs Reporter: chakali ranga swamy The datanode log shows socket and datastream problems; unable to upload a text file to DFS. I deleted the tmp folders dfs and mapred, formatted again with hadoop namenode -format, and ran start-all.sh. The dfs folder now contains: datanode, namenode, secondarynamenode; mapred is empty.
About space:
linux-8ysi:/etc/hadoop/hadoop-0.20.2 # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 25G 16G 7.4G 69% /
udev 987M 212K 986M 1% /dev
/dev/sda7 42G 5.5G 34G 14% /home
http://localhost:50070/dfshealth.jsp shows:
NameNode 'localhost:54310'
Started: Wed Jun 15 04:13:14 IST 2011
Version: 0.20.2, r911707
Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
Upgrades: There are no upgrades in progress.
Cluster Summary: 10 files and directories, 0 blocks = 10 total. Heap Size is 15.5 MB / 966.69 MB (1%)
Configured Capacity : 24.61 GB
DFS Used : 24 KB
Non DFS Used : 17.23 GB
DFS Remaining : 7.38 GB
DFS Used% : 0 %
DFS Remaining% : 29.99 %
Live Nodes : 1
Dead Nodes : 0
NameNode Storage: Storage Directory: /tmp/Testinghadoop/dfs/name, Type: IMAGE_AND_EDITS, State: Active
core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/Testinghadoop/</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.permissions</name>
    <value>true</value>
    <description>If true, enable permission checking in HDFS. If false, permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner, or group of files or directories.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at. If local, then jobs are run in-process as a single map and reduce task.</description>
  </property>
</configuration>
Please give suggestions about this error:
linux-8ysi:/etc/hadoop/hadoop-0.20.2/conf # hadoop fsck /
RUN_JAVA /usr/java/jre1.6.0_25/bin/java
.Status: HEALTHY
Total size: 0 B
Total dirs: 7
Total files: 1 (Files currently being written: 1)
Total blocks (validated): 0
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 1
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 1
Number of racks: 1
The filesystem under path '/' is HEALTHY
linux-8ysi:/etc/hadoop/hadoop-0.20.2/conf # hadoop dfsadmin
[jira] [Resolved] (HDFS-193) DataNode exceptions reading local disk
[ https://issues.apache.org/jira/browse/HDFS-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-193. -- Resolution: Cannot Reproduce Resolving as Cannot Reproduce. Please feel free to reopen it if it is not the case. DataNode exceptions reading local disk -- Key: HDFS-193 URL: https://issues.apache.org/jira/browse/HDFS-193 Project: Hadoop HDFS Issue Type: Bug Reporter: sam rash We get 100s of exceptions at WARN level per day indicating errors while trying to read local blocks. When this occurs, I've checked on the local box's dfs.data.dir and the block is not present. Here is a relevant snippet from the logs regarding the missing block. It *looks* like the DataNode deletes the block and then tries to read it again later. NOTE: this is for the jar file as up to 8 hosts have this exception for one block and our data repl factor is only 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941115#comment-13941115 ] Hadoop QA commented on HDFS-6093: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635665/hdfs-6093-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6445//console This message is automatically generated. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941117#comment-13941117 ] Jing Zhao commented on HDFS-6089: - bq. This is because if we don't have further operations it is possible that SBN will wait a long time to tail that part of edits which is in an in-progress segment. bq. In this scenario, the ANN will keep rolling every 2mins, generating a lot of edit log segments that aren't being cleared out. Hmm, actually my thought yesterday was not correct. Yes, we cannot do auto rolling simply based on time, and the reason is just as [~andrew.wang] pointed out. Hopefully this is my last question, just to make sure: the current SBN auto roller can cause the same issue (a lot of edit log segments not being cleared out) when checkpointing is broken but the SBN is not down, right? Anyway, I will post a patch to add an rpc timeout later. Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended Key: HDFS-6089 URL: https://issues.apache.org/jira/browse/HDFS-6089 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jing Zhao Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch The following scenario was tested: * Determine the Active NN and suspend the process (kill -19) * Wait about 60s to let the standby transition to active * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active. What was noticed was that sometimes the call to get the service state of nn2 got a socket timeout exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6093: -- Attachment: hdfs-6093-3.patch Rebased patch. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch, hdfs-6093-3.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-2282) Semi-harmless race between block reports and block invalidation
[ https://issues.apache.org/jira/browse/HDFS-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-2282. --- Resolution: Not A Problem I believe this is Not A Problem anymore. Please feel free to reopen it if it is not the case. Semi-harmless race between block reports and block invalidation --- Key: HDFS-2282 URL: https://issues.apache.org/jira/browse/HDFS-2282 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 0.20.203.0, 0.24.0 Reporter: Todd Lipcon Attachments: hdfs-2282-20.txt, hdfs-2282-20.txt In the 0.20.203 codebase, block reports are not synchronized in any way against mutations to the actual file structure on disk. If a file is removed from a directory while the block report is scanning that directory, it will be mistakenly reported as existing with a length of 0, since File.length() on a non-existent file returns 0. This results in an error being logged on the DataNode when the NN sends it a second block deletion request for the already-deleted block. I believe it to be harmless, but the error message can concern users. This was fixed in the 0.20 code line in HDFS-2379. This jira remains open to track the port to 0.24. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941138#comment-13941138 ] Andrew Wang commented on HDFS-6089: --- bq. the current SBN auto roller can cause the same issue (a lot of edit log segments not being cleared out) when checkpointing is broken but the SBN is not down, right? Yea, I think so. Checkpointing is pretty stable nowadays though, so I think this is pretty rare :) Thanks Jing. Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended Key: HDFS-6089 URL: https://issues.apache.org/jira/browse/HDFS-6089 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jing Zhao Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch The following scenario was tested: * Determine the Active NN and suspend the process (kill -19) * Wait about 60s to let the standby transition to active * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active. What was noticed was that sometimes the call to get the service state of nn2 got a socket timeout exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941149#comment-13941149 ] Jing Zhao commented on HDFS-6127: - The patch looks good to me. Some comments: # Nit: HAUtil#getServiceUriFromToken and TokenManager#getInstance need formatting. # The javadoc of HAUtil#getServiceUriFromToken needs to be updated after the change: the method can now support URIs of other filesystems, not just HDFS. # The change in TestWebhdfsForHA actually weakens our unit test, I think. We still need the fs.renew and fs.cancel for regression of HDFS-5339. A separate unit test with some code copy should be fine here, I guess.
{code}
- fs.renewDelegationToken(token);
- fs.cancelDelegationToken(token);
+ token.renew(conf);
+ token.cancel(conf);
{code}
+1 after addressing the comments. WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In an HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in an HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
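For illustration, a minimal sketch of the token-centric path that the diff above switches to; the renewer string is a placeholder, and whether renewal succeeds when the token's service is a logical HA name is exactly what this issue is about:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class TokenRenewSketch {
  static void exercise(FileSystem fs, Configuration conf) throws Exception {
    // "someRenewer" is a placeholder renewer principal.
    Token<?> token = fs.getDelegationToken("someRenewer");

    // Token resolves its renewer from the token's kind and service name;
    // with an HA nameservice id as the service, this resolution is the
    // code path that HDFS-6127 fixes.
    token.renew(conf);
    token.cancel(conf);
  }
}
{code}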
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941166#comment-13941166 ] Hadoop QA commented on HDFS-6093: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635673/hdfs-6093-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6446//console This message is automatically generated. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch, hdfs-6093-3.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941171#comment-13941171 ] Jing Zhao commented on HDFS-6127: - bq. The change in TestWebhdfsForHA actually weakens our unit test, I think Actually it will not, since fs.renew and fs.cancel will still be called in the end. So please just ignore this comment. WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In an HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in an HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6121) Support of mount onto HDFS directories
[ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941179#comment-13941179 ] Andrew Wang commented on HDFS-6121: --- Hi Yan, thanks for writing this JIRA. A few questions, trying to get a handle on your ideas. Each block in HDFS is pretty large (64MB+), so there shouldn't be much seeking from a sequential file read. Asking for large I/Os does a pretty good job of giving the app full disk throughput. I've seen Impala light up a DN with 100% disk utilization across 12 disks, so I know it works in practice. An important part of this is scheduling for disk locality. I agree that remote reads will kill your throughput. As to a more flexible mapping scheme, HDFS-2832 laid the groundwork under the heterogeneous storage initiative. This will support multiple storages per DN, where each one can be a collection of one or more disks. Also potentially related is the work on block affinity groups done for HBase, which helps keep the working set of an app colocated on the same set of machines. Support of mount onto HDFS directories Key: HDFS-6121 URL: https://issues.apache.org/jira/browse/HDFS-6121 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yan Currently, HDFS configuration can only create HDFS on one or several existing local file system directories. This pretty much abstracts physical disk drives away from HDFS users. While it may provide conveniences in data movement/manipulation/management/formatting, it could deprive users of a way to access physical disks in a more directly controlled manner. For instance, a multi-threaded server may wish to access its disk blocks sequentially per thread for fear of random I/O otherwise. If the cluster boxes have multiple physical disk drives, and the server load is pretty much I/O-bound, then it is quite reasonable to hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various layers may alleviate the problem to some degree, but they cannot totally eliminate it. This could hurt the performance of workloads that need to scan data. Map/Reduce may experience the same problem as well. For instance, HBase region servers may wish to scan disk data for each region in a sequential way, again, to avoid random I/O. HBase's incapability in this regard aside, one major obstacle is HDFS's inability to specify mappings of local directories to HDFS directories. Specifically, the dfs.data.dir configuration setting only allows for the mapping from one or multiple local directories to the HDFS root directory. In the case of data nodes with multiple disk drives mounted as multiple local file system directories per node, the HDFS data will be spread over all disk drives in a fairly random manner, potentially resulting in random I/O when a multi-threaded server reads multiple data blocks from its threads. A seemingly simple enhancement is the introduction of mappings from one or multiple local FS directories to a single HDFS directory, plus the necessary sanity checks, replication policies, advice on best practices, ..., etc., of course. Note that this should be a one-to-one or many-to-one mapping from local to HDFS directories. The other way around, though probably feasible, won't serve our purpose at all. This is similar to the mounting of different disks onto different local FS directories, and will give users an option to place and access their data in a more controlled and efficient way.
Conceptually, this option will allow for local physical data partitioning in a distributed environment for application data on HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6127: - Attachment: HDFS-6127.001.patch WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch, HDFS-6127.001.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In an HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in an HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6127) WebHDFS tokens cannot be renewed in HA setup
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941185#comment-13941185 ] Haohui Mai commented on HDFS-6127: -- The v1 patch addresses Jing's comments. I also use {{Mockito.verify}} to ensure that the corresponding methods are called. WebHDFS tokens cannot be renewed in HA setup Key: HDFS-6127 URL: https://issues.apache.org/jira/browse/HDFS-6127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6127.000.patch, HDFS-6127.001.patch The {{TokenAspect}} class assumes that the service name of the token is always a host-ip pair. In an HA setup, however, the service name becomes the name service id, which breaks the assumption. As a result, WebHDFS tokens cannot be renewed in an HA setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
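The {{Mockito.verify}} pattern referred to above, as a generic, self-contained sketch; the real test would verify the token renew/cancel calls against a spied filesystem, while the list here is just a stand-in:
{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import java.util.List;

public class VerifySketch {
  public static void main(String[] args) {
    @SuppressWarnings("unchecked")
    List<String> sink = mock(List.class);

    sink.add("renewed");         // stands in for the call made by the code under test

    verify(sink).add("renewed"); // fails the test if the expected interaction never happened
  }
}
{code}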
[jira] [Commented] (HDFS-5196) Provide more snapshot information in WebUI
[ https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941189#comment-13941189 ] Haohui Mai commented on HDFS-5196: --
{code}
+private final String modificationTime;
+private final String permission;
{code}
They should be long and short, respectively.
{code}
+ public SnapshottableDirectoryStatus.Bean[] getSnapshottableDirs() { ...
+ public SnapshotInfo.Bean[] getSnapshottedDirs() { ...
{code}
I think it is okay for them to return {{null}} here if no snapshottable directories are found. Looking at the code, it might make more sense to create an MXBean of the {{SnapshotManager}} to record all the information. The motivation is that both {{FSNamesystemState}} and {{NameNodeInfo}} are frequently queried but the snapshot information is not. That also allows the UI to make only one HTTP call instead of two. What do you think? Provide more snapshot information in WebUI -- Key: HDFS-5196 URL: https://issues.apache.org/jira/browse/HDFS-5196 Project: Hadoop HDFS Issue Type: Improvement Components: snapshots Affects Versions: 3.0.0 Reporter: Haohui Mai Assignee: Shinichi Yamashita Priority: Minor Attachments: HDFS-5196-2.patch, HDFS-5196-3.patch, HDFS-5196-4.patch, HDFS-5196-5.patch, HDFS-5196-6.patch, HDFS-5196.patch, HDFS-5196.patch, HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, snapshotteddir.png The WebUI should provide more detailed information about snapshots, such as all snapshottable directories and corresponding number of snapshots (suggested in HDFS-4096). -- This message was sent by Atlassian JIRA (v6.2#6252)
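A rough sketch of the MXBean suggestion, with hypothetical interface and attribute names; the real attribute types would be the Bean classes from the patch under review rather than the string arrays used here:
{code}
import org.apache.hadoop.metrics2.util.MBeans;

public class SnapshotMXBeanSketch {
  // Hypothetical MXBean; each getter surfaces one JMX attribute.
  public interface SnapshotStatsMXBean {
    String[] getSnapshottableDirectories();
    String[] getSnapshots();
  }

  static class SnapshotManagerStub implements SnapshotStatsMXBean {
    SnapshotManagerStub() {
      // A single registration exposes both attributes under /jmx, so the
      // web UI can fetch everything in one HTTP call instead of two.
      MBeans.register("NameNode", "SnapshotInfo", this);
    }
    @Override public String[] getSnapshottableDirectories() { return new String[0]; }
    @Override public String[] getSnapshots() { return new String[0]; }
  }
}
{code}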
[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6089: Attachment: HDFS-6089.002.patch Patch that adds an RPC timeout for the rollEditLog call. I set the default timeout to 20s. Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended Key: HDFS-6089 URL: https://issues.apache.org/jira/browse/HDFS-6089 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Jing Zhao Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch, HDFS-6089.002.patch The following scenario was tested: * Determine the Active NN and suspend the process (kill -19) * Wait about 60s to let the standby transition to active * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active. What was noticed was that sometimes the call to get the service state of nn2 got a socket timeout exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
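The shape of the change, sketched with a stand-in protocol; the real patch targets the NN-to-NN {{NamenodeProtocol}} proxy, and this assumes the {{RPC.getProxy}} overload that accepts an rpcTimeout argument. The 20000 ms literal mirrors the default mentioned above; the actual patch presumably wires it through a config key:
{code}
import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.VersionedProtocol;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.UserGroupInformation;

public class RollTimeoutSketch {
  // Stand-in protocol; not the real NamenodeProtocol.
  public interface EditLogRoller extends VersionedProtocol {
    long versionID = 1L;
    void rollEditLog() throws IOException;
  }

  static EditLogRoller proxyTo(InetSocketAddress activeNn, Configuration conf)
      throws IOException {
    // The trailing int is the rpcTimeout in milliseconds: a call that does
    // not complete within 20s throws instead of blocking the SBN's
    // transition to active behind a suspended ANN.
    return RPC.getProxy(EditLogRoller.class, EditLogRoller.versionID, activeNn,
        UserGroupInformation.getCurrentUser(), conf,
        NetUtils.getDefaultSocketFactory(conf), 20000);
  }
}
{code}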
[jira] [Created] (HDFS-6129) When a replica is not found for deletion, do not throw exception.
Tsz Wo Nicholas Sze created HDFS-6129: - Summary: When a replica is not found for deletion, do not throw exception. Key: HDFS-6129 URL: https://issues.apache.org/jira/browse/HDFS-6129 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor It is actually a valid case if a replica is not found for deletion -- the replica may have been deleted earlier, so we should not throw an exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6121) Support of mount onto HDFS directories
[ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941199#comment-13941199 ] Chris Nauroth commented on HDFS-6121: - Also, some of the goals in HDFS-2832 led toward rejection of a mount-based design. We wanted to support transparent migration of files between different storage types. We also wanted to support mixed placement policies (i.e., 1 replica on SSD and 2 replicas on HDD). If storage type is tightly coupled to path, then migrating a file to a different storage type means moving it, possibly breaking applications that expect to find it at the old path. Mixed placement policies would become either impossible (if a path maps to only one storage type) or possible but difficult to manage (if a separate path is required for each desired storage combination). I agree with Andrew's comment that while utilization of the local disks is random/round-robin, large blocks and large I/O operations tend to balance this out in practice. Support of mount onto HDFS directories Key: HDFS-6121 URL: https://issues.apache.org/jira/browse/HDFS-6121 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yan Currently, HDFS configuration can only create HDFS on one or several existing local file system directories. This pretty much abstracts physical disk drives away from HDFS users. While it may provide conveniences in data movement/manipulation/management/formatting, it could deprive users of a way to access physical disks in a more directly controlled manner. For instance, a multi-threaded server may wish to access its disk blocks sequentially per thread for fear of random I/O otherwise. If the cluster boxes have multiple physical disk drives, and the server load is pretty much I/O-bound, then it is quite reasonable to hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various layers may alleviate the problem to some degree, but they cannot totally eliminate it. This could hurt the performance of workloads that need to scan data. Map/Reduce may experience the same problem as well. For instance, HBase region servers may wish to scan disk data for each region in a sequential way, again, to avoid random I/O. HBase's incapability in this regard aside, one major obstacle is HDFS's inability to specify mappings of local directories to HDFS directories. Specifically, the dfs.data.dir configuration setting only allows for the mapping from one or multiple local directories to the HDFS root directory. In the case of data nodes with multiple disk drives mounted as multiple local file system directories per node, the HDFS data will be spread over all disk drives in a fairly random manner, potentially resulting in random I/O when a multi-threaded server reads multiple data blocks from its threads. A seemingly simple enhancement is the introduction of mappings from one or multiple local FS directories to a single HDFS directory, plus the necessary sanity checks, replication policies, advice on best practices, ..., etc., of course. Note that this should be a one-to-one or many-to-one mapping from local to HDFS directories. The other way around, though probably feasible, won't serve our purpose at all. This is similar to the mounting of different disks onto different local FS directories, and will give users an option to place and access their data in a more controlled and efficient way.
Conceptually, this option will allow for local physical data partitioning in a distributed environment for application data on HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6105: - Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2, and branch-2.4. Thanks to [~kihwal] and [~tthompso] for reporting and verifying the patch. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6129) When a replica is not found for deletion, do not throw exception.
[ https://issues.apache.org/jira/browse/HDFS-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6129: -- Attachment: h6129_20140319.patch h6129_20140319.patch: Just log an info message when a replica is not found for deletion. When a replica is not found for deletion, do not throw exception. - Key: HDFS-6129 URL: https://issues.apache.org/jira/browse/HDFS-6129 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6129_20140319.patch It is actually a valid case if a replica is not found for deletion -- the replica may have been deleted earlier, so we should not throw an exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
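The gist of the change as a sketch; the map below is a stand-in for the DataNode's replica map, not the actual FsDataset code:
{code}
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class InvalidateSketch {
  private static final Log LOG = LogFactory.getLog(InvalidateSketch.class);

  // Stand-in for the DataNode's replica map, keyed by block id.
  private final Map<Long, Object> replicaMap = new HashMap<Long, Object>();

  void invalidate(long blockId) {
    Object replica = replicaMap.remove(blockId);
    if (replica == null) {
      // Valid case: the replica may have been deleted earlier, so log an
      // info message instead of throwing an exception.
      LOG.info("Block " + blockId + " not found for deletion; ignoring.");
      return;
    }
    // ... delete the replica's block and meta files here ...
  }
}
{code}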
[jira] [Updated] (HDFS-6129) When a replica is not found for deletion, do not throw exception.
[ https://issues.apache.org/jira/browse/HDFS-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6129: -- Status: Patch Available (was: Open) When a replica is not found for deletion, do not throw exception. - Key: HDFS-6129 URL: https://issues.apache.org/jira/browse/HDFS-6129 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h6129_20140319.patch It is actually a valid case if a replica is not found for deletion -- the replica may have been deleted earlier, so we should not throw an exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941229#comment-13941229 ] Hudson commented on HDFS-6105: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5361 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5361/]) HDFS-6105. NN web UI for DN list loads the same jmx page multiple times. Contributed by Haohui Mai. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1579468) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-6105.000.patch, datanodes-tab.png When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6121) Support of mount onto HDFS directories
[ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941256#comment-13941256 ] Yan commented on HDFS-6121: --- Thanks to Andrew and Chris for their comments. It seems to me this proposal is orthogonal to heterogeneous storage in HDFS-2832, although I agree that mixing the two in the current scheme might lead to some difficulty and confusion in terms of configuration specification and understanding of the full picture. A 2-dimensional problem is almost always much more difficult to grasp than a 1-dimensional problem. Conceptually, the distinction is fairly clear: heterogeneous storage tries to address the different physical characteristics of different types of storage media, while this proposal tries to address I/O contention on each physical disk even when the disks are of the same type or even identical. With this distinction in mind, how to make the configuration and use of the two, particularly in combination, easy and clear is a secondary question, I believe. So the focal point is this: will this lead to random I/O or not? Note that the large HDFS block size is not very relevant here, because I/O requests as viewed by disk drivers are in units of disk block sizes, which are typically much smaller. In other words, an HDFS block is not the I/O request unit as taken by disks, and the I/O calls made by HDFS clients are not atomic to disks. This leaves room for random I/O between different threads. In reality, it may or may not show up in an emphatic way. But when it occurs, it could be very bad. It'd be interesting to see Impala's detailed stats, as mentioned by Andrew, along with its execution characteristics, when 100% disk utilization was observed. Support of mount onto HDFS directories Key: HDFS-6121 URL: https://issues.apache.org/jira/browse/HDFS-6121 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yan Currently, HDFS configuration can only create HDFS on one or several existing local file system directories. This pretty much abstracts physical disk drives away from HDFS users. While it may provide conveniences in data movement/manipulation/management/formatting, it could deprive users of a way to access physical disks in a more directly controlled manner. For instance, a multi-threaded server may wish to access its disk blocks sequentially per thread for fear of random I/O otherwise. If the cluster boxes have multiple physical disk drives, and the server load is pretty much I/O-bound, then it is quite reasonable to hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various layers may alleviate the problem to some degree, but they cannot totally eliminate it. This could hurt the performance of workloads that need to scan data. Map/Reduce may experience the same problem as well. For instance, HBase region servers may wish to scan disk data for each region in a sequential way, again, to avoid random I/O. HBase's incapability in this regard aside, one major obstacle is HDFS's inability to specify mappings of local directories to HDFS directories. Specifically, the dfs.data.dir configuration setting only allows for the mapping from one or multiple local directories to the HDFS root directory.
In the case of data nodes with multiple disk drives mounted as multiple local file system directories per node, the HDFS data will be spread over all disk drives in a fairly random manner, potentially resulting in random I/O when a multi-threaded server reads multiple data blocks from its threads. A seemingly simple enhancement is the introduction of mappings from one or multiple local FS directories to a single HDFS directory, plus the necessary sanity checks, replication policies, advice on best practices, ..., etc., of course. Note that this should be a one-to-one or many-to-one mapping from local to HDFS directories. The other way around, though probably feasible, won't serve our purpose at all. This is similar to the mounting of different disks onto different local FS directories, and will give users an option to place and access their data in a more controlled and efficient way. Conceptually, this option will allow for local physical data partitioning in a distributed environment for application data on HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941266#comment-13941266 ] Haohui Mai commented on HDFS-5978: --
{code}
+ private static class FSImageLoader {
+   private static String[] stringTable;
{code}
Please put {{FSImageLoader}} in a separate file. The handler can take a reference to {{FSImageLoader}}.
{code}
+import org.apache.hadoop.hdfs.web.JsonUtil;
+import org.apache.hadoop.hdfs.web.resources.GetOpParam;
{code}
It is better to avoid the dependency. Can you experiment with {{Jackson#ObjectMapper}}?
{code}
+ Path path = new Path(strPath);
+ Stack<String> stack = new Stack<String>();
+ while (path.getParent() != null) {
+   stack.push(path.getName());
+   path = path.getParent();
+ }
{code}
I think {{String.split("/")}} is sufficient here.
{code}
+ "  -host specifies the host to listen. (0.0.0.0 by default)\n"
+ "  -port specifies the port to listen. (0 by default)\n"
{code}
It might be simpler to just have one option that takes an {{InetSocketAddress}} directly.
{code}
+ public void initServer() throws IOException { ...
{code}
The function should look like this:
{code}
public void initServer(String fsimage) {
  FSImageLoader image = FSImageLoader.load(fsimage);
  // construct the pipeline
  // bind
}
{code}
That way all the locks / condition variables / sleeps are no longer required in the unit test. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose the read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI, to debug the fsimage. It also allows the operator to interactively browse the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
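The {{String.split}} suggestion in sketch form; {{getChildId}} and its placeholder logic are hypothetical stand-ins for the loader's real inode lookup:
{code}
public class PathWalkSketch {
  // Hypothetical lookup: child inode id by (parent id, name).
  static long getChildId(long parentId, String name) {
    return parentId + name.hashCode(); // placeholder logic only
  }

  // Resolve a path by splitting on "/" instead of climbing Path.getParent()
  // with an explicit Stack, as suggested in the review above.
  static long lookup(String strPath, long rootId) {
    long id = rootId;
    for (String component : strPath.split("/")) {
      if (!component.isEmpty()) { // the leading "/" yields an empty element
        id = getChildId(id, component);
      }
    }
    return id;
  }
}
{code}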
[jira] [Commented] (HDFS-6121) Support of mount onto HDFS directories
[ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941280#comment-13941280 ] Andrew Wang commented on HDFS-6121: --- We actually have an unstable public API for querying on which disk a block resides (see {{DistributedFileSystem#getFileBlockStorageLocations}}), which might be along the lines of what you're asking for. Impala uses this to schedule on a per-disk basis, and we saw a big jump in disk utilization after we implemented this. The limitation is that it doesn't fix I/O contention from users other than Impala, but it works well within a single app. Writes are less of a problem unless you're calling fsync, since they're buffered by the OS. With regard to interleaving, my experience has been that as long as an app is doing large reads (meaning 4MB or 8MB), it'll get good performance. I'm not sure how request splitting in the disk driver works, but I imagine the elevator algorithms at the various levels will try to serve a large read as a bunch of contiguous smaller requests. Support of mount onto HDFS directories Key: HDFS-6121 URL: https://issues.apache.org/jira/browse/HDFS-6121 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yan Currently, HDFS configuration can only create HDFS on one or several existing local file system directories. This pretty much abstracts physical disk drives away from HDFS users. While it may provide conveniences in data movement/manipulation/management/formatting, it could deprive users of a way to access physical disks in a more directly controlled manner. For instance, a multi-threaded server may wish to access its disk blocks sequentially per thread for fear of random I/O otherwise. If the cluster boxes have multiple physical disk drives, and the server load is pretty much I/O-bound, then it is quite reasonable to hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various layers may alleviate the problem to some degree, but they cannot totally eliminate it. This could hurt the performance of workloads that need to scan data. Map/Reduce may experience the same problem as well. For instance, HBase region servers may wish to scan disk data for each region in a sequential way, again, to avoid random I/O. HBase's incapability in this regard aside, one major obstacle is HDFS's inability to specify mappings of local directories to HDFS directories. Specifically, the dfs.data.dir configuration setting only allows for the mapping from one or multiple local directories to the HDFS root directory. In the case of data nodes with multiple disk drives mounted as multiple local file system directories per node, the HDFS data will be spread over all disk drives in a fairly random manner, potentially resulting in random I/O when a multi-threaded server reads multiple data blocks from its threads. A seemingly simple enhancement is the introduction of mappings from one or multiple local FS directories to a single HDFS directory, plus the necessary sanity checks, replication policies, advice on best practices, ..., etc., of course. Note that this should be a one-to-one or many-to-one mapping from local to HDFS directories. The other way around, though probably feasible, won't serve our purpose at all. This is similar to the mounting of different disks onto different local FS directories, and will give users an option to place and access their data in a more controlled and efficient way.
Conceptually, this option will allow for local physical data partitioning in a distributed environment for application data on HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
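For reference, a sketch of using that unstable API, assuming the Hadoop 2.x-era signatures and, most likely, {{dfs.datanode.hdfs-blocks-metadata.enabled}} set to true on the DataNodes; details may differ by release:
{code}
import java.util.Arrays;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.BlockStorageLocation;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.VolumeId;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class VolumeIdSketch {
  static void printVolumes(DistributedFileSystem dfs, Path file) throws Exception {
    BlockLocation[] blocks =
        dfs.getFileBlockLocations(dfs.getFileStatus(file), 0, Long.MAX_VALUE);
    BlockStorageLocation[] storages =
        dfs.getFileBlockStorageLocations(Arrays.asList(blocks));
    for (BlockStorageLocation storage : storages) {
      // Each VolumeId identifies the disk holding one replica, which is what
      // enables the per-disk scheduling described above.
      for (VolumeId volume : storage.getVolumeIds()) {
        System.out.println(storage + " -> " + volume);
      }
    }
  }
}
{code}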
[jira] [Updated] (HDFS-5910) Enhance DataTransferProtocol to support encrypted and plain-text communication
[ https://issues.apache.org/jira/browse/HDFS-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5910: Status: Patch Available (was: Open) Enhance DataTransferProtocol to support encrypted and plain-text communication -- Key: HDFS-5910 URL: https://issues.apache.org/jira/browse/HDFS-5910 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5910.patch It is possible to enable encryption of DataTransferProtocol. In some use cases, it is required to encrypt data transfer with some clients, but to communicate in plain text with other clients and datanodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5910) Enhance DataTransferProtocol to support encrypted and plain-text communication
[ https://issues.apache.org/jira/browse/HDFS-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941295#comment-13941295 ] Arpit Agarwal commented on HDFS-5910: - [~benoyantony] is this ready for review? I noticed you hadn't submitted the patch. Enhance DataTransferProtocol to support encrypted and plain-text communication -- Key: HDFS-5910 URL: https://issues.apache.org/jira/browse/HDFS-5910 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5910.patch It is possible to enable encryption of DataTransferProtocol. In some use cases, it is required to encrypt data transfer with some clients, but to communicate in plain text with other clients and datanodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941307#comment-13941307 ] Tsz Wo Nicholas Sze commented on HDFS-6038: --- +1 HDFS-6038.008.patch looks good. Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In an HA setup, the JNs receive edit logs (blobs) from the NN and write them into edit log files. In order to write well-formed edit log files, the JNs prepend a header to each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}}) in the edit log, and therefore generates incorrect edit logs when a newer release bumps the {{NameNodeLayoutVersion}} during a rolling upgrade. Meanwhile, the JN currently tries to decode the in-progress editlog segment in order to determine the last txid in the segment. In the rolling upgrade scenario, a JN with the old software may not be able to correctly decode an editlog generated by the new software. This jira makes the following changes to allow the JN to handle editlogs produced by software with a future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let the NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by the JN; it does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
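A sketch of the skip-by-length scanning idea from point 2 above; the op framing here (txid, then length, then the body) is simplified and is not the actual on-disk edit log format:
{code}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.IOUtils;

public class ScanSketch {
  // Returns the last txid seen, without decoding any op body. This is why a
  // JN running old software can scan a segment written by newer software.
  static long scanForLastTxid(DataInputStream in) throws IOException {
    long lastTxid = -1;
    while (true) {
      long txid;
      try {
        txid = in.readLong();        // simplified op header: txid, then length
      } catch (EOFException eof) {
        return lastTxid;             // clean end of segment
      }
      int length = in.readInt();     // the new per-op length field
      IOUtils.skipFully(in, length); // jump straight to the next op
      lastTxid = txid;
    }
  }
}
{code}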