[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781617#comment-13781617 ] Brandon Li commented on HDFS-5256: -- I've committed the patch. Thank you, Haohui, for the contribution! Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
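For readers unfamiliar with the pattern, a LoadingCache pairs bounded, LRU-style eviction with compute-on-miss loading. Below is a minimal, hypothetical stdlib sketch of that idea (it is not the actual DFSClientCache or LruCache code from this patch), built on LinkedHashMap's access-order mode; with Guava itself one would instead use CacheBuilder.newBuilder().maximumSize(n).build(...) with a CacheLoader.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: an LRU cache that loads missing entries on demand,
// which is roughly the contract Guava's LoadingCache provides.
class LoadingLruCache<K, V> {
    private final Map<K, V> map;
    private final Function<K, V> loader;

    LoadingLruCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true makes iteration order equal to LRU order.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once over capacity.
                return size() > maxEntries;
            }
        };
    }

    synchronized V get(K key) {
        V v = map.get(key);
        if (v == null) {               // miss: invoke the loader, like CacheLoader.load()
            v = loader.apply(key);
            map.put(key, v);
        }
        return v;
    }

    synchronized int size() {
        return map.size();
    }
}
```

Guava's LoadingCache adds features this sketch omits (removal listeners, expiry after access, concurrent striping), which is why a DFSClient cache benefits from reusing it rather than a hand-rolled map.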
[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781621#comment-13781621 ] Hudson commented on HDFS-5256: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4496 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4496/]) HDFS-5256. Use guava LoadingCache to implement DFSClientCache. Contributed by Haohui Mai (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1527452) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/LruCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestDFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5256: - Resolution: Fixed Status: Resolved (was: Patch Available) Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5256: - Fix Version/s: 2.1.2-beta Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781656#comment-13781656 ] Binglin Chang commented on HDFS-5276: - Reading so many HDFS files in parallel in one process is not a normal case. Maybe we can: 1. try some of the optimization tricks mentioned in HADOOP-5318; 2. since statistics are mainly used in MapReduce job reports, if Spark (or another framework) doesn't need them, we can make this feature configurable and disable it in the many-thread case. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated HDFS-5276: Attachment: DisableFSReadWriteBytesStat.patch Here is a patch that tries to make the read/write bytes stat configurable; by default the stat is still collected, and setting fs.read.write.bytes.stat to false disables it. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781738#comment-13781738 ] Hudson commented on HDFS-5256: -- FAILURE: Integrated in Hadoop-Yarn-trunk #348 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/348/]) HDFS-5256. Use guava LoadingCache to implement DFSClientCache. Contributed by Haohui Mai (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1527452) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/LruCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestDFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781798#comment-13781798 ] Hudson commented on HDFS-5256: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1538 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1538/]) HDFS-5256. Use guava LoadingCache to implement DFSClientCache. Contributed by Haohui Mai (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1527452) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/LruCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestDFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5256) Use guava LoadingCache to implement DFSClientCache
[ https://issues.apache.org/jira/browse/HDFS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781848#comment-13781848 ] Hudson commented on HDFS-5256: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1564 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1564/]) HDFS-5256. Use guava LoadingCache to implement DFSClientCache. Contributed by Haohui Mai (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1527452) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/LruCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestDFSClientCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Use guava LoadingCache to implement DFSClientCache -- Key: HDFS-5256 URL: https://issues.apache.org/jira/browse/HDFS-5256 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5256.000.patch, HDFS-5256.001.patch, HDFS-5256.002.patch, HDFS-5256.003.patch, HDFS-5256.004.patch Google Guava provides an implementation of LoadingCache. Use the LoadingCache to implement DFSClientCache in NFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: HDFS-5230.008.patch Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782043#comment-13782043 ] Hadoop QA commented on HDFS-5230: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605937/HDFS-5230.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5063//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5063//console This message is automatically generated. Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch The XDR class is one fundamental aspect of the current NFS server implementation.
While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5277) hadoop fs -expunge does not work for federated namespace
Vrushali C created HDFS-5277: Summary: hadoop fs -expunge does not work for federated namespace Key: HDFS-5277 URL: https://issues.apache.org/jira/browse/HDFS-5277 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Vrushali C We noticed that the hadoop fs -expunge command does not work across federated namespaces. It seems to look only at /user/username/.Trash instead of traversing all available namespaces and expunging from each individual namespace. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5277) hadoop fs -expunge does not work for federated namespace
[ https://issues.apache.org/jira/browse/HDFS-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782046#comment-13782046 ] Vrushali C commented on HDFS-5277: -- Trying to expose the federated namespace from within FSShell to get expunge working might be tricky and not very clean. Passing a filesystem argument to expunge works: it creates a timestamp-named directory inside trash and moves the contents under current to that dir. For example, if one of the namespaces in the cluster is called namespace1, a command like the following will work: {noformat} $ hadoop fs -fs hdfs://namespace1/ -expunge 13/09/27 23:07:06 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/hadoop/.Trash/130927230706 $ hadoop fs -ls hdfs://namespace1/user/hadoop/.Trash Found 1 items drwx------ - hadoop hadoop 0 2013-09-27 22:59 hdfs://namespace1/user/hadoop/.Trash/130927230706 $ {noformat} hadoop fs -expunge does not work for federated namespace - Key: HDFS-5277 URL: https://issues.apache.org/jira/browse/HDFS-5277 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Vrushali C We noticed that the hadoop fs -expunge command does not work across federated namespaces. It seems to look only at /user/username/.Trash instead of traversing all available namespaces and expunging from each individual namespace. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5278) Reduce memory consumptions of TestDFSClientRetries
Haohui Mai created HDFS-5278: Summary: Reduce memory consumptions of TestDFSClientRetries Key: HDFS-5278 URL: https://issues.apache.org/jira/browse/HDFS-5278 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor TestDFSClientRetries::testDFSClientRetriesOnBusyBlocks() spawns about 50 threads during execution, each of which takes more than 6m of memory. This makes debugging it in Eclipse under the default settings difficult, since it triggers an OutOfMemoryException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782058#comment-13782058 ] Colin Patrick McCabe commented on HDFS-5276: Why not keep thread-local read statistics and sum them up periodically? That seems better than disabling this entirely. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
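The thread-local idea suggested above can be sketched as follows. This is an illustrative design under assumed names (StripedBytesReadCounter is hypothetical, not Hadoop code), similar in spirit to the striping that java.util.concurrent.atomic.LongAdder later standardized: each thread increments only its own cell, so hot-path writes are uncontended, and readers sum the cells periodically to get an approximate total.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of per-thread statistics cells summed on demand.
class StripedBytesReadCounter {
    // One cell per thread; volatile is sufficient because only the owning
    // thread ever writes its cell, while readers only need visibility.
    static final class Cell { volatile long value; }

    private final List<Cell> cells = new CopyOnWriteArrayList<>();
    private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
        Cell c = new Cell();
        cells.add(c);    // register the new thread's cell for readers
        return c;
    });

    void add(long bytes) {
        local.get().value += bytes;   // uncontended write, no CAS retry loop
    }

    long sum() {                      // read path: sum all per-thread cells
        long total = 0;
        for (Cell c : cells) total += c.value;
        return total;
    }
}
```

The trade-off is that sum() is a weakly consistent snapshot rather than an exact instantaneous count, which is acceptable for job-report statistics.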
[jira] [Updated] (HDFS-5278) Reduce memory consumptions of TestDFSClientRetries
[ https://issues.apache.org/jira/browse/HDFS-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5278: - Status: Patch Available (was: Open) Reduce memory consumptions of TestDFSClientRetries -- Key: HDFS-5278 URL: https://issues.apache.org/jira/browse/HDFS-5278 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor Attachments: HDFS-5278.000.patch TestDFSClientRetries::testDFSClientRetriesOnBusyBlocks() spawns about 50 threads during execution, each of which takes more than 6m of memory. This makes debugging it in Eclipse under the default settings difficult, since it triggers an OutOfMemoryException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5278) Reduce memory consumptions of TestDFSClientRetries
[ https://issues.apache.org/jira/browse/HDFS-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5278: - Attachment: HDFS-5278.000.patch Reduce memory consumptions of TestDFSClientRetries -- Key: HDFS-5278 URL: https://issues.apache.org/jira/browse/HDFS-5278 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor Attachments: HDFS-5278.000.patch TestDFSClientRetries::testDFSClientRetriesOnBusyBlocks() spawns about 50 threads during execution, each of which takes more than 6m of memory. This makes debugging it in Eclipse under the default settings difficult, since it triggers an OutOfMemoryException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (HDFS-5272) DFSUtil#getInfoServer cleanup
[ https://issues.apache.org/jira/browse/HDFS-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780635#comment-13780635 ] Suresh Srinivas edited comment on HDFS-5272 at 9/30/13 6:10 PM: Hi Jing, I can make the change too if you are busy. was (Author: arpitagarwal): Hi Jing, I can make the change too if you are busy. Thanks. DFSUtil#getInfoServer cleanup - Key: HDFS-5272 URL: https://issues.apache.org/jira/browse/HDFS-5272 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Jing Zhao The {{httpsAddress}} parameter is not used outside of tests. {code} public static String getInfoServer(InetSocketAddress namenodeAddr, Configuration conf, boolean httpsAddress) throws IOException { boolean securityOn = UserGroupInformation.isSecurityEnabled(); String httpAddressKey = (securityOn && httpsAddress) ? DFS_NAMENODE_HTTPS_ADDRESS_KEY : DFS_NAMENODE_HTTP_ADDRESS_KEY; String httpAddressDefault = (securityOn && httpsAddress) ? DFS_NAMENODE_HTTPS_ADDRESS_DEFAULT : DFS_NAMENODE_HTTP_ADDRESS_DEFAULT; {code} [~jingzhao] confirmed on HDFS-5271 that DFS_NAMENODE_HTTP_ADDRESS_DEFAULT will be the default that is always available, so there should be no reason to choose between DFS_NAMENODE_HTTP_ADDRESS_DEFAULT and DFS_NAMENODE_HTTPS_ADDRESS_DEFAULT. Filing this JIRA to clean it up.
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: (was: HDFS-5230.009.patch) Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: HDFS-5230.009.patch Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: HDFS-5230.009.patch Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: HDFS-5230.009.patch Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5230: - Attachment: (was: HDFS-5230.009.patch) Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782095#comment-13782095 ] Hadoop QA commented on HDFS-5230: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605945/HDFS-5230.009.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5065//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5065//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5065//console This message is automatically generated. 
Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is one fundamental aspect of the current NFS server implementation. While the client might potentially have higher-level APIs, it also requires redundant copying since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes ChannelBuffer directly to the client in order to open up the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782099#comment-13782099 ] Luke Lu commented on HDFS-4885: --- I think this API change is problematic for several reasons: # Loss of useful information by replacing the rack diff with a boolean. We should preserve existing information as much as possible if we're generalizing the API. # Not general enough. minRacks is not general enough for a generic API. # srcPath is useful for info/debug status. Suggestion: {code} public BlockPlacementStatus verifyBlockPlacement(String srcPath, LocatedBlock blk, BlockPlacementConstraints constraints); {code} Since overriding is covariant on the return type (since Java 5), a new BlockPlacementStatus can be implemented for a new BlockPlacementPolicy later on without any information loss. Update verifyBlockPlacement() API in BlockPlacementPolicy - Key: HDFS-4885 URL: https://issues.apache.org/jira/browse/HDFS-4885 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Junping Du Assignee: Junping Du Labels: BlockPlacementPolicy Attachments: HDFS-4885.patch verifyBlockPlacement() has an unused parameter, srcPath, as its responsibility is just to verify a single block rather than the files under a specific path. Also, the return value (int) does not make sense, as a block placement violation has cases other than the number of racks, so a boolean value would be better. -- This message was sent by Atlassian JIRA (v6.1#6144)
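The covariant-return point in the comment above can be sketched as follows. All class names here are hypothetical stand-ins (they are not the classes from the patch): since Java 5, an override may narrow the declared return type, so a richer status object can be introduced by a new policy later without changing existing callers and without collapsing the rack diff into a boolean.

```java
// Hypothetical base status: callers only need a yes/no answer.
class BlockPlacementStatus {
    boolean isPlacementPolicySatisfied() { return true; }
}

// Richer subclass: preserves the rack diff instead of reducing it to a boolean.
class RackAwareStatus extends BlockPlacementStatus {
    final int additionalRacksNeeded;
    RackAwareStatus(int needed) { this.additionalRacksNeeded = needed; }
    @Override
    boolean isPlacementPolicySatisfied() { return additionalRacksNeeded <= 0; }
}

class BlockPlacementPolicy {
    BlockPlacementStatus verifyBlockPlacement() {
        return new BlockPlacementStatus();
    }
}

class RackAwarePolicy extends BlockPlacementPolicy {
    @Override
    RackAwareStatus verifyBlockPlacement() {   // covariant return type (Java 5+)
        return new RackAwareStatus(1);         // e.g. one more rack still needed
    }
}
```

Existing callers that hold a BlockPlacementPolicy reference keep compiling, while code aware of RackAwarePolicy can read the extra detail from the narrowed return type.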
[jira] [Created] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
Chris Nauroth created HDFS-5279: --- Summary: Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5279: Target Version/s: 2.1.1-beta, 3.0.0 (was: 3.0.0, 2.1.1-beta) Status: Patch Available (was: Open) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.1-beta, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5279: Attachment: HDFS-5279.1.patch I'm attaching a patch to add guards against a null {{FSNamesystem}}. I did a full review of the JSP pages, and I think this patch covers everything. I also skipped rendering the Browse filesystem hyperlink when the {{FSNamesystem}} is null, because there is no way for that page to do anything meaningful before initialization of the {{FSNamesystem}}. I've added several tests that pass null to the {{NamenodeJspHelper}} methods. I also manually tested a browser opening all of these pages before the {{FSNamesystem}} is initialized. Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
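The guard pattern described in the patch notes above can be sketched briefly. The interface and method names below are illustrative stand-ins, not the real NamenodeJspHelper code: the point is only that every JSP helper checks the FSNamesystem reference before dereferencing it, since the HTTP server can be up before initialization completes.

```java
// Hypothetical sketch of the null-guard pattern: a helper that renders a
// metric falls back to an empty string while the NameNode is still starting,
// instead of throwing NullPointerException. All names are illustrative.
interface FSNamesystemView {
    long getCapacityTotal();
}

class NamenodeJspGuard {
    static String renderCapacity(FSNamesystemView fsn) {
        if (fsn == null) {
            return ""; // guard: page requested before FSNamesystem init
        }
        return Long.toString(fsn.getCapacityTotal());
    }
}
```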
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782107#comment-13782107 ] Andrew Wang commented on HDFS-5276: --- +1 for Colin's suggestion, that was my first thought too. Increment thread-locals, sum them up on read. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton for each FS scheme, and each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782110#comment-13782110 ] Brandon Li commented on HDFS-5230: -- +1. The eclipse and release audit warnings are build environment issues. Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is a fundamental part of the current NFS server implementation. While the client could potentially be offered higher-level APIs, XDR also requires redundant copying, since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes the ChannelBuffer directly to the client in order to open the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782118#comment-13782118 ] Brandon Li commented on HDFS-5230: -- I've committed the patch. Thank you, Haohui, for the contribution! Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is a fundamental part of the current NFS server implementation. While the client could potentially be offered higher-level APIs, XDR also requires redundant copying, since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes the ChannelBuffer directly to the client in order to open the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782122#comment-13782122 ] Suresh Srinivas commented on HDFS-5279: --- [~cnauroth], how about making a change in the web UI to say NameNode is initializing instead of giving either empty string or 0 as count when FSNamesystem is null? Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782126#comment-13782126 ] Suresh Srinivas commented on HDFS-5276: --- bq. Increment thread-locals, sum them up on read. How do you know all the threads that are maintaining thread local variables? FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton for each FS scheme, and each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782127#comment-13782127 ] Suresh Srinivas commented on HDFS-5276: --- bq. about 70% cpu time is spent on FileSystem.Statistics.incrementBytesRead(). BTW, I am really surprised by the 70% number. Given that the client is reading the data and doing CRC validation, this number seems too high. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton for each FS scheme, and each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782137#comment-13782137 ] Hudson commented on HDFS-5230: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4498 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4498/]) HDFS-5230. Introduce RpcInfo to decouple XDR classes from the RPC API. Contributed by Haohui Mai (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1527726) * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Base.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcCallCache.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcInfo.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcProgram.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcResponse.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcUtil.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/SimpleTcpServer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/SimpleTcpServerHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/SimpleUdpServer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/SimpleUdpServerHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/XDR.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/Verifier.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/RpcProgramPortmap.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestFrameDecoder.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcCallCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is a fundamental part of the current NFS server implementation. While the client could potentially be offered higher-level APIs, XDR also requires redundant copying, since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes the ChannelBuffer directly to the client in order to open the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5230: - Resolution: Fixed Status: Resolved (was: Patch Available) Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is a fundamental part of the current NFS server implementation. While the client could potentially be offered higher-level APIs, XDR also requires redundant copying, since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes the ChannelBuffer directly to the client in order to open the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5230) Introduce RpcInfo to decouple XDR classes from the RPC API
[ https://issues.apache.org/jira/browse/HDFS-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5230: - Fix Version/s: 2.1.2-beta Introduce RpcInfo to decouple XDR classes from the RPC API -- Key: HDFS-5230 URL: https://issues.apache.org/jira/browse/HDFS-5230 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5230.002.patch, HDFS-5230.003.patch, HDFS-5230.004.patch, HDFS-5230.005.patch, HDFS-5230.006.patch, HDFS-5230.007.patch, HDFS-5230.008.patch, HDFS-5230.009.patch The XDR class is a fundamental part of the current NFS server implementation. While the client could potentially be offered higher-level APIs, XDR also requires redundant copying, since the upstream clients have insufficient information. This JIRA introduces a new class, RpcInfo, which (1) decouples XDR from the APIs, turning it into a utility class, and (2) exposes the ChannelBuffer directly to the client in order to open the opportunity to avoid copying. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5278) Reduce memory consumptions of TestDFSClientRetries
[ https://issues.apache.org/jira/browse/HDFS-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782174#comment-13782174 ] Hadoop QA commented on HDFS-5278: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605940/HDFS-5278.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5064//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5064//console This message is automatically generated. Reduce memory consumptions of TestDFSClientRetries -- Key: HDFS-5278 URL: https://issues.apache.org/jira/browse/HDFS-5278 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor Attachments: HDFS-5278.000.patch TestDFSClientRetries::testDFSClientRetriesOnBusyBlocks() spawns about 50 threads during execution, each of which takes more than 6 MB of memory. This makes debugging it in Eclipse under the default settings difficult, since it triggers an OutOfMemoryError.
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
[ https://issues.apache.org/jira/browse/HDFS-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-5190: - Assignee: Andrew Wang move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI - Key: HDFS-5190 URL: https://issues.apache.org/jira/browse/HDFS-5190 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Colin Patrick McCabe Assignee: Andrew Wang As per the discussion in HDFS-5158, we should move the cache pool add, remove, list commands into cacheadmin. We also should write a unit test in TestHDFSCLI for these commands. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782191#comment-13782191 ] Daryn Sharp commented on HDFS-4311: --- [~tucu00], there's a push to get a slew of test coverage JIRAs committed. Since this is a manual test with a hardcoded /Users/tucu/tucu.keytab, could you please review/commit? repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos --- Key: HDFS-4311 URL: https://issues.apache.org/jira/browse/HDFS-4311 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Attachments: HDFS-4311.patch Some of the test cases in this test class are failing because they are affected by static state changed by previous test cases, namely the static field org.apache.hadoop.security.UserGroupInformation.loginUser. The suggested patch solves this problem. In addition, the following improvements are made: 1) the user principal and keytab values are parameterized via system properties; 2) shutdown of the Jetty server and the minicluster between test cases is added to make the test methods independent of each other. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
[ https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782193#comment-13782193 ] Daryn Sharp commented on HDFS-4312: --- This needs to be rebased using hadoop-minikdc. fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc -- Key: HDFS-4312 URL: https://issues.apache.org/jira/browse/HDFS-4312 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Attachments: HDFS-4312.patch TestSecureNameNode does not work on Java 6 without the dfs.web.authentication.kerberos.principal config property set. Also, the following were improved: 1) keytab files are checked for existence and readability to fail fast on config errors; 2) a comment was added to TestSecureNameNode describing the required system properties; 3) string literals were replaced with config constants. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
[ https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782209#comment-13782209 ] Kihwal Lee commented on HDFS-4510: -- Sorry it seems the patch got stale again. Would you refresh it again? We will try to get it in this time. Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests Key: HDFS-4510 URL: https://issues.apache.org/jira/browse/HDFS-4510 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4510-branch-0.23-a.patch, HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-0.23-c.patch, HADOOP-4510-branch-2-a.patch, HADOOP-4510-branch-2-b.patch, HADOOP-4510-branch-2-c.patch, HADOOP-4510-trunk-a.patch, HADOOP-4510-trunk-b.patch, HADOOP-4510-trunk-c.patch, HDFS-4510-trunk--N27.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4511) Cover package org.apache.hadoop.hdfs.tools with unit test
[ https://issues.apache.org/jira/browse/HDFS-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1378#comment-1378 ] Kihwal Lee commented on HDFS-4511: -- The patch looks good in general, but I could not run it. It applies but won't build. Would you refresh the patch? Cover package org.apache.hadoop.hdfs.tools with unit test - Key: HDFS-4511 URL: https://issues.apache.org/jira/browse/HDFS-4511 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4511-branch-0.23-a.patch, HADOOP-4511-branch-2-a.patch, HADOOP-4511-trunk-a.patch, HDFS-4511-branch-2--N2.patch, HDFS-4511-trunk--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5279: Attachment: HDFS-5279.2.patch Thanks, [~sureshms]. I'm attaching version 2 of the patch, which sets an initializing label and refactors some duplicated code for generating the label behind a helper method. Regarding the 0 counts, this was part of the XML output, and I figured it was safer to emit the exact same XML structure with 0 values, just in case of compatibility problems with anything that parses that XML and isn't equipped to handle missing elements or new elements. Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch, HDFS-5279.2.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4517) Cover class RemoteBlockReader with unit tests
[ https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782244#comment-13782244 ] Kihwal Lee commented on HDFS-4517: -- +1 The patch looks reasonable. I've built and run it with -Pnative and the result looks as expected. Cover class RemoteBlockReader with unit tests - Key: HDFS-4517 URL: https://issues.apache.org/jira/browse/HDFS-4517 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4517-branch-0.23-a.patch, HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, HDFS-4517-branch-2--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782247#comment-13782247 ] Hadoop QA commented on HDFS-5279: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605950/HDFS-5279.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5066//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5066//console This message is automatically generated. Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch, HDFS-5279.2.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. 
There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782253#comment-13782253 ] Alejandro Abdelnur commented on HDFS-4311: -- The right fix is converting these tests to use hadoop-minikdc. repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos --- Key: HDFS-4311 URL: https://issues.apache.org/jira/browse/HDFS-4311 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.3-alpha Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Attachments: HDFS-4311.patch Some of the test cases in this test class are failing because they are affected by static state changed by previous test cases, namely the static field org.apache.hadoop.security.UserGroupInformation.loginUser. The suggested patch solves this problem. In addition, the following improvements are made: 1) the user principal and keytab values are parameterized via system properties; 2) shutdown of the Jetty server and the minicluster between test cases is added to make the test methods independent of each other. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5280) Corrupted meta files on data nodes prevent DFSClient from connecting to data nodes and updating corruption status to name node.
Jinghui Wang created HDFS-5280: -- Summary: Corrupted meta files on data nodes prevent DFSClient from connecting to data nodes and updating corruption status to name node. Key: HDFS-5280 URL: https://issues.apache.org/jira/browse/HDFS-5280 Project: Hadoop HDFS Issue Type: Bug Components: datanode, hdfs-client Affects Versions: 2.0.4-alpha, 2.1.0-beta, 1.1.1 Environment: Red Hat Enterprise 6.4, Hadoop-2.1.0 Reporter: Jinghui Wang Corrupted meta files cause the DFSClient to fail to connect to the datanodes to access the blocks, so the DFSClient never performs a read on the block; that read is what throws the ChecksumException when file blocks are corrupted and reports to the namenode to mark the block as corrupt. Since the client never gets that far, the file status remains healthy, and so do all the blocks. To reproduce the error, put a file onto HDFS. Running hadoop fsck /tmp/bogus.csv -files -blocks -location gives the following output: FSCK started for path /tmp/bogus.csv at 11:33:29 /tmp/bogus.csv 109 bytes, 1 block(s): OK 0. blk_-4255166695856420554_5292 len=109 repl=3 Find the block/meta files for 4255166695856420554 by running ssh datanode1.address find /hadoop/ -name *4255166695856420554*, which gives the following output: /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554 /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta Now corrupt the meta file by running ssh datanode1.address sed -i -e '1i 1234567891' /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta Now running hadoop fs -cat /tmp/bogus.csv shows the stack trace of the DFSClient failing to connect to the datanode with the corrupted meta file. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5281) COMMIT request should not block
Brandon Li created HDFS-5281: Summary: COMMIT request should not block Key: HDFS-5281 URL: https://issues.apache.org/jira/browse/HDFS-5281 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Currently the COMMIT request is handled synchronously, blocking for up to 30 seconds before timing out. This JIRA is to make it asynchronous so that it won't block other requests coming from the same channel. -- This message was sent by Atlassian JIRA (v6.1#6144)
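The deferred-reply idea behind HDFS-5281 can be sketched in plain Java. This is a hedged illustration only, not the actual patch: the class and method names (PendingCommits, deferCommit, onFlushed) are invented for this sketch, which simply parks a pending COMMIT reply keyed by the offset it waits for and completes it once a flush passes that offset, so the RPC handler thread never blocks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch: park COMMIT replies instead of blocking the handler thread.
public class PendingCommits {
    // Pending commits keyed by the file offset they are waiting for.
    private final ConcurrentSkipListMap<Long, Runnable> pending =
        new ConcurrentSkipListMap<>();

    /** Called from the request handler: record the reply instead of blocking. */
    public void deferCommit(long commitOffset, Runnable sendReply) {
        pending.put(commitOffset, sendReply);
    }

    /** Called from the write path after a flush reaches flushedOffset. */
    public void onFlushed(long flushedOffset) {
        // Complete every commit whose target offset is now durable.
        // (A real implementation must guard against replies racing with
        // new deferrals at the same offset; this sketch ignores that.)
        Map<Long, Runnable> ready = pending.headMap(flushedOffset, true);
        for (Runnable reply : ready.values()) {
            reply.run();
        }
        ready.clear();
    }
}
```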
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782307#comment-13782307 ] Colin Patrick McCabe commented on HDFS-5276: bq. How do you know all the threads that are maintaining thread local variables? The first time a thread tries to access a thread-local variable, it will get null. At that point, the thread creates the thread-local counters object, takes a mutex, and adds a reference to it to the list inside FileSystem. Periodically, we go over the list of thread-locals and sum them up into a total. (We also do that summation when reading statistics.) At that point, we remove any thread-locals which belong to threads which no longer exist. Check out the flat combining paper, which is a more abstract description of this idea: http://www.cs.bgu.ac.il/~hendlerd/papers/flat-combining.pdf FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
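The scheme Colin describes (a per-thread counter registered in a shared list, summed under a mutex, with thread-locals of dead threads pruned) can be sketched in plain Java. All names here are hypothetical assumptions, not Hadoop code; note that the pruning step must fold a dead thread's count into a running total so nothing is lost, which is Suresh's concern later in the thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of contention-free statistics: each thread increments its own
// counter; readers take the lock, sum everything, and prune dead threads.
public class ThreadLocalStats {
    private static class Counter {
        final Thread owner = Thread.currentThread();
        volatile long bytesRead; // written only by the owner thread
    }

    private final List<Counter> all = new ArrayList<>();
    private long retired; // counts folded in from exited threads; guarded by 'all'

    private final ThreadLocal<Counter> local = new ThreadLocal<Counter>() {
        @Override protected Counter initialValue() {
            Counter c = new Counter();
            synchronized (all) { all.add(c); } // contended only once per thread
            return c;
        }
    };

    public void incrementBytesRead(long n) {
        local.get().bytesRead += n; // uncontended fast path
    }

    public long getBytesRead() {
        synchronized (all) {
            long total = retired;
            for (Iterator<Counter> it = all.iterator(); it.hasNext(); ) {
                Counter c = it.next();
                total += c.bytesRead;
                if (!c.owner.isAlive()) {
                    // Fold the dead thread's count into the running total
                    // before pruning so its contribution is never lost.
                    retired += c.bytesRead;
                    it.remove();
                }
            }
            return total;
        }
    }
}
```

Readers may observe a slightly stale sum since the fast path takes no lock, which is generally acceptable for statistics.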
[jira] [Commented] (HDFS-4273) Problem in DFSInputStream read retry logic may cause early failure
[ https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782326#comment-13782326 ] Luke Lu commented on HDFS-4273: --- Looks like the patch has gone stale... Problem in DFSInputStream read retry logic may cause early failure -- Key: HDFS-4273 URL: https://issues.apache.org/jira/browse/HDFS-4273 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HDFS-4273.patch, HDFS-4273-v2.patch, HDFS-4273.v3.patch, TestDFSInputStream.java Assume the following call logic: {noformat} readWithStrategy() -> blockSeekTo() -> readBuffer() -> reader.doRead() -> seekToNewSource() adds currentNode to deadNodes, wishing to get a different datanode -> blockSeekTo() -> chooseDataNode() -> block missing, so clear deadNodes and pick the currentNode again; seekToNewSource() returns false; readBuffer() re-throws the exception; quit loop; readWithStrategy() gets the exception, and may fail the read call before having tried maxBlockAcquireFailures times {noformat} Some issues with this logic: 1. the seekToNewSource() logic is broken because it may clear deadNodes in the middle. 2. the variable int retries=2 in readWithStrategy seems to conflict with maxBlockAcquireFailures; should it be removed? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4511) Cover package org.apache.hadoop.hdfs.tools with unit test
[ https://issues.apache.org/jira/browse/HDFS-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782339#comment-13782339 ] Daryn Sharp commented on HDFS-4511: --- TestDelegationTokenRemoteFetcher: # Don't hardcode a port number; it prevents concurrent test execution on the same host # Negative cases only check for a non-IOException; if the call erroneously succeeds, the test doesn't appear to fail # The tests with a mocked-up NIO http server appear to test only that the client contacted the server. It just returns canned responses regardless of the actual request made, i.e., it doesn't check that the client sent the correct parameters, esp. the correct token in the correct format. Cover package org.apache.hadoop.hdfs.tools with unit test - Key: HDFS-4511 URL: https://issues.apache.org/jira/browse/HDFS-4511 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4511-branch-0.23-a.patch, HADOOP-4511-branch-2-a.patch, HADOOP-4511-trunk-a.patch, HDFS-4511-branch-2--N2.patch, HDFS-4511-trunk--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782347#comment-13782347 ] Suresh Srinivas commented on HDFS-5276: --- bq. At that point, we remove any thread-locals which belong to threads which no longer exist. The counts from threads that are no longer running should still be included in the statistics count. Currently the statistics object is passed from the client to the file system. This implementation may need incompatible changes. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-4517) Cover class RemoteBlockReader with unit tests
[ https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4517: - Assignee: Dennis Y Cover class RemoteBlockReader with unit tests - Key: HDFS-4517 URL: https://issues.apache.org/jira/browse/HDFS-4517 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Dennis Y Attachments: HADOOP-4517-branch-0.23-a.patch, HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, HDFS-4517-branch-2--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-4517) Cover class RemoteBlockReader with unit tests
[ https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4517: - Resolution: Fixed Fix Version/s: 2.3.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks for the patches, Vadim and Dennis. Cover class RemoteBlockReader with unit tests - Key: HDFS-4517 URL: https://issues.apache.org/jira/browse/HDFS-4517 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Dennis Y Fix For: 3.0.0, 2.3.0 Attachments: HADOOP-4517-branch-0.23-a.patch, HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, HDFS-4517-branch-2--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4512) Cover package org.apache.hadoop.hdfs.server.common with tests
[ https://issues.apache.org/jira/browse/HDFS-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782365#comment-13782365 ] Kihwal Lee commented on HDFS-4512: -- +1 looks good to me. Cover package org.apache.hadoop.hdfs.server.common with tests - Key: HDFS-4512 URL: https://issues.apache.org/jira/browse/HDFS-4512 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4512-branch-0.23-a.patch, HADOOP-4512-branch-2-a.patch, HADOOP-4512-trunk-a.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4517) Cover class RemoteBlockReader with unit tests
[ https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782375#comment-13782375 ] Hudson commented on HDFS-4517: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4500 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4500/]) HDFS-4517. Cover class RemoteBlockReader with unit tests. Contributed by Vadim Bondarev and Dennis Y. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1527807) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java Cover class RemoteBlockReader with unit tests - Key: HDFS-4517 URL: https://issues.apache.org/jira/browse/HDFS-4517 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Assignee: Dennis Y Fix For: 3.0.0, 2.3.0 Attachments: HADOOP-4517-branch-0.23-a.patch, HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, HDFS-4517-branch-2--N4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782394#comment-13782394 ] Haohui Mai commented on HDFS-5276: -- [~chengxiang li], can you post the detailed configuration of the test, for example, what kind of CPU you are using to run it? And can you test the difference in end-to-end latency before and after the patch? Although this JIRA does bring up a good point, many of us are yet to be convinced that this is a real problem in production settings. Since many research proposals have been available since the '80s, I believe it will be straightforward to apply one of them once the community is convinced, by a detailed analysis of the behavior of real-world Hadoop workloads, that this affects performance significantly. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5279) Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem.
[ https://issues.apache.org/jira/browse/HDFS-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782395#comment-13782395 ] Hadoop QA commented on HDFS-5279: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605967/HDFS-5279.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5067//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5067//console This message is automatically generated. Guard against NullPointerException in NameNode JSP pages before initialization of FSNamesystem. --- Key: HDFS-5279 URL: https://issues.apache.org/jira/browse/HDFS-5279 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5279.1.patch, HDFS-5279.2.patch HDFS-4372 added tracking of NameNode startup progress. As part of that change, the NameNode HTTP server now starts before initialization of the {{FSNamesystem}}. 
There are a few code paths remaining in the JSP pages that are at risk of causing {{NullPointerException}} if accessed when the {{FSNamesystem}} has not been fully initialized. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4885) Update verifyBlockPlacement() API in BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782414#comment-13782414 ] Junping Du commented on HDFS-4885: -- Thanks Luke for the suggestion. That's definitely much better. For BlockPlacementStatus, I think it could contain two fields like the ones below {code} int misReplicated String misRepInfo {code} The first field can replace the current missingRacks to make sure no information is lost, and the second field can be printed directly by NamenodeFsck. For BlockPlacementConstraints, I'd like to design it as a POJO, where different BlockPlacementPolicy implementations get different properties, e.g., BPPDefault gets minRacks and BPPWithNodeGroup gets minRacks and minNodeGroups. How does that sound? Update verifyBlockPlacement() API in BlockPlacementPolicy - Key: HDFS-4885 URL: https://issues.apache.org/jira/browse/HDFS-4885 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Junping Du Assignee: Junping Du Labels: BlockPlacementPolicy Attachments: HDFS-4885.patch verifyBlockPlacement() has an unused parameter, srcPath, as its responsibility is just to verify a single block rather than the files under a specific path. Also, the int return value does not make sense, since a block placement violation can involve more than the number of racks; a boolean value would be better. -- This message was sent by Atlassian JIRA (v6.1#6144)
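A minimal sketch of the proposed result object, using the two field names from the comment above; the class shape and accessor names are assumptions for illustration, not the committed API.

```java
// Sketch of the two-field status object proposed for verifyBlockPlacement():
// a count that generalizes the old missingRacks return value, plus a
// human-readable detail string that NamenodeFsck can print directly.
public class BlockPlacementStatus {
    private final int misReplicated;  // replaces the old missingRacks count
    private final String misRepInfo;  // detail for NamenodeFsck output

    public BlockPlacementStatus(int misReplicated, String misRepInfo) {
        this.misReplicated = misReplicated;
        this.misRepInfo = misRepInfo;
    }

    /** The placement policy is satisfied when nothing is mis-replicated. */
    public boolean isPolicySatisfied() { return misReplicated == 0; }

    public int getMisReplicated() { return misReplicated; }

    public String getMisRepInfo() { return misRepInfo; }
}
```

This keeps the boolean answer the description asks for while preserving the count, so no information is lost relative to the old int return value.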
[jira] [Updated] (HDFS-5263) Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp
[ https://issues.apache.org/jira/browse/HDFS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasu Mariyala updated HDFS-5263: Status: Patch Available (was: Open) Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp -- Key: HDFS-5263 URL: https://issues.apache.org/jira/browse/HDFS-5263 Project: Hadoop HDFS Issue Type: Bug Components: namenode, webhdfs Reporter: Vasu Mariyala Attachments: HDFS-5263.patch When Kerberos authentication is enabled, we are unable to browse to the data nodes using (Name node web page -> Live Nodes -> select any of the data nodes). The reason is that the delegation token is not provided as part of the URL in the generateNodeDataHeader method of NodeListJsp: {code} String url = HttpConfig.getSchemePrefix() + d.getHostName() + ":" + d.getInfoPort() + "/browseDirectory.jsp?namenodeInfoPort=" + nnHttpPort + "&dir=" + URLEncoder.encode("/", "UTF-8") + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, nnaddr); {code} But browsing the file system using name node web page -> Browse the file system -> any directory works fine, as the redirectToRandomDataNode method of NamenodeJspHelper creates the delegation token: {code} redirectLocation = HttpConfig.getSchemePrefix() + fqdn + ":" + redirectPort + "/browseDirectory.jsp?namenodeInfoPort=" + nn.getHttpAddress().getPort() + "&dir=/" + (tokenString == null ? "" : JspHelper.getDelegationTokenUrlParam(tokenString)) + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, addr); {code} I will work on providing a patch for this issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5263) Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp
[ https://issues.apache.org/jira/browse/HDFS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasu Mariyala updated HDFS-5263: Attachment: HDFS-5263.patch Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp -- Key: HDFS-5263 URL: https://issues.apache.org/jira/browse/HDFS-5263 Project: Hadoop HDFS Issue Type: Bug Components: namenode, webhdfs Reporter: Vasu Mariyala Attachments: HDFS-5263.patch When Kerberos authentication is enabled, we are unable to browse to the data nodes using (Name node web page -> Live Nodes -> select any of the data nodes). The reason is that the delegation token is not provided as part of the URL in the generateNodeDataHeader method of NodeListJsp: {code} String url = HttpConfig.getSchemePrefix() + d.getHostName() + ":" + d.getInfoPort() + "/browseDirectory.jsp?namenodeInfoPort=" + nnHttpPort + "&dir=" + URLEncoder.encode("/", "UTF-8") + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, nnaddr); {code} But browsing the file system using name node web page -> Browse the file system -> any directory works fine, as the redirectToRandomDataNode method of NamenodeJspHelper creates the delegation token: {code} redirectLocation = HttpConfig.getSchemePrefix() + fqdn + ":" + redirectPort + "/browseDirectory.jsp?namenodeInfoPort=" + nn.getHttpAddress().getPort() + "&dir=/" + (tokenString == null ? "" : JspHelper.getDelegationTokenUrlParam(tokenString)) + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, addr); {code} I will work on providing a patch for this issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782443#comment-13782443 ] Haohui Mai commented on HDFS-5276: -- And I'm still trying to understand the rationale for addressing architectural problems in DFSClient. Controlling issues such as cache alignment and synchronization from the JVM is also essential to avoid contention. Since the information is simply unavailable to Java programs, in my personal opinion the problem might be better addressed in the JVM, or at even lower abstraction levels. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HDFS-5274: Affects Version/s: 2.1.1-beta Status: Patch Available (was: Open) Add Tracing to HDFS --- Key: HDFS-5274 URL: https://issues.apache.org/jira/browse/HDFS-5274 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Affects Versions: 2.1.1-beta Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch Since Google's Dapper paper has shown the benefits of tracing for a large distributed system, it seems like a good time to add tracing to HDFS. HBase has added tracing using HTrace. I propose that the same can be done within HDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5274) Add Tracing to HDFS
[ https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HDFS-5274: Attachment: HDFS-5274-1.patch WIP patch. This one has testing for the read and write paths started. Add Tracing to HDFS --- Key: HDFS-5274 URL: https://issues.apache.org/jira/browse/HDFS-5274 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Affects Versions: 2.1.1-beta Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HDFS-5274-0.patch, HDFS-5274-1.patch Since Google's Dapper paper has shown the benefits of tracing for a large distributed system, it seems like a good time to add tracing to HDFS. HBase has added tracing using HTrace. I propose that the same can be done within HDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782500#comment-13782500 ] Colin Patrick McCabe commented on HDFS-5276: bq. The counts from threads that are no longer running should still be included in the statistics count. Currently the statistics object is passed from the client to the file system. This implementation may need incompatible changes. There's nothing incompatible about it. The objects used for thread-local storage are not the same objects the client is passing around. My point is that, if you keep adding objects whenever a thread is created, you also have to get rid of them when the thread is destroyed. Otherwise, you have a memory leak. It would be really simple to come up with a patch that does thread-local counters. I don't have time today, but maybe later this week. bq. Controlling issues such as cache alignment and synchronization from the JVM is also essential to avoid contention. Since the information is simply unavailable to Java programs, in my personal opinion the problem might be better addressed in the JVM, or at even lower abstraction levels. The JVM has some problems, but this isn't one of them. Accessing the same memory from many different threads at once is inherently slow on modern multicore CPUs because of cache coherency issues. It's up to software designers to avoid this if they want the best performance. FileSystem.Statistics got performance issue on multi-thread read/write. --- Key: HDFS-5276 URL: https://issues.apache.org/jira/browse/HDFS-5276 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-alpha Reporter: Chengxiang Li Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG FileSystem.Statistics is a singleton variable for each FS scheme; each read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does not perform well with many threads (say, more than 30), so it may cause a serious performance issue. During our Spark test profiling, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5263) Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp
[ https://issues.apache.org/jira/browse/HDFS-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782554#comment-13782554 ] Hadoop QA commented on HDFS-5263: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606002/HDFS-5263.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5068//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5068//console This message is automatically generated. 
Delegation token is not created in the generateNodeDataHeader method of NamenodeJspHelper$NodeListJsp -- Key: HDFS-5263 URL: https://issues.apache.org/jira/browse/HDFS-5263 Project: Hadoop HDFS Issue Type: Bug Components: namenode, webhdfs Reporter: Vasu Mariyala Attachments: HDFS-5263.patch When Kerberos authentication is enabled, we are unable to browse to the data nodes using (Name node web page -> Live Nodes -> select any of the data nodes). The reason is that the delegation token is not provided as part of the URL in the generateNodeDataHeader method of NodeListJsp: {code} String url = HttpConfig.getSchemePrefix() + d.getHostName() + ":" + d.getInfoPort() + "/browseDirectory.jsp?namenodeInfoPort=" + nnHttpPort + "&dir=" + URLEncoder.encode("/", "UTF-8") + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, nnaddr); {code} But browsing the file system using name node web page -> Browse the file system -> any directory works fine, as the redirectToRandomDataNode method of NamenodeJspHelper creates the delegation token: {code} redirectLocation = HttpConfig.getSchemePrefix() + fqdn + ":" + redirectPort + "/browseDirectory.jsp?namenodeInfoPort=" + nn.getHttpAddress().getPort() + "&dir=/" + (tokenString == null ? "" : JspHelper.getDelegationTokenUrlParam(tokenString)) + JspHelper.getUrlParam(JspHelper.NAMENODE_ADDRESS, addr); {code} I will work on providing a patch for this issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5282) Support built-in resolution of /~ to user's home directory: /user/{user.name}
Kevin Minder created HDFS-5282: -- Summary: Support built-in resolution of /~ to user's home directory: /user/{user.name} Key: HDFS-5282 URL: https://issues.apache.org/jira/browse/HDFS-5282 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kevin Minder In many cases it would be very convenient for HDFS (and WebHDFS in particular) to support the Unix notion of /~ representing the user's home directory. This would make some scripts reusable because they would not need to contain hard-coded home directories. -- This message was sent by Atlassian JIRA (v6.1#6144)
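The proposed expansion can be sketched in a few lines, assuming the usual HDFS convention of /user/&lt;username&gt; for home directories; resolveHome is a hypothetical helper written for this sketch, not an existing HDFS or WebHDFS API.

```java
// Sketch: expand a leading "/~" to the caller's home directory,
// following the /user/<username> convention mentioned in the issue title.
public class HomePathResolver {
    public static String resolveHome(String path, String userName) {
        if (path.equals("/~")) {
            return "/user/" + userName;               // bare home reference
        }
        if (path.startsWith("/~/")) {
            // keep everything after "/~", which starts with "/"
            return "/user/" + userName + path.substring(2);
        }
        return path;                                   // no expansion needed
    }
}
```

A WebHDFS-style caller would apply this before issuing the request, e.g. resolveHome("/~/data/input.csv", "alice") yields /user/alice/data/input.csv.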
[jira] [Updated] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
[ https://issues.apache.org/jira/browse/HDFS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5171: - Status: Patch Available (was: Open) NFS should create input stream for a file and try to share it with multiple read requests - Key: HDFS-5171 URL: https://issues.apache.org/jira/browse/HDFS-5171 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Haohui Mai Attachments: HDFS-5171.000.patch Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As with write requests, NFS should create an input stream for a file and try to share it with multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
[ https://issues.apache.org/jira/browse/HDFS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5171: - Attachment: HDFS-5171.000.patch NFS should create input stream for a file and try to share it with multiple read requests - Key: HDFS-5171 URL: https://issues.apache.org/jira/browse/HDFS-5171 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Haohui Mai Attachments: HDFS-5171.000.patch Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As with write requests, NFS should create an input stream for a file and try to share it with multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
[ https://issues.apache.org/jira/browse/HDFS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782620#comment-13782620 ]

Hadoop QA commented on HDFS-5171:
-

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12606038/HDFS-5171.000.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5069//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5069//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5069//console

This message is automatically generated.

NFS should create input stream for a file and try to share it with multiple read requests
-
Key: HDFS-5171
URL: https://issues.apache.org/jira/browse/HDFS-5171
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Brandon Li
Assignee: Haohui Mai
Attachments: HDFS-5171.000.patch

Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As it does for write requests, NFS should create an input stream for a file and try to share it among multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec).
--
This message was sent by Atlassian JIRA (v6.1#6144)
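The share-one-stream-per-file-with-idle-timeout policy described in the issue can be sketched roughly as below. This is a hypothetical toy cache for illustration only, not the patch's code: the committed DFSClientCache builds on Guava's LoadingCache (see HDFS-5256 elsewhere in this thread), but the eviction shape is the same — reuse one open stream across read requests and close it once no read has touched it for a while (e.g., 10 s). All class and method names here are invented.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical idle-expiring stream cache; timestamps are passed in to keep it testable. */
class IdleStreamCache<K, V extends Closeable> {
    private static final class Entry<V> {
        final V stream;
        volatile long lastAccessNanos;
        Entry(V stream, long now) { this.stream = stream; this.lastAccessNanos = now; }
    }

    private final long idleTimeoutNanos;
    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();

    IdleStreamCache(long idleTimeoutNanos) { this.idleTimeoutNanos = idleTimeoutNanos; }

    /** Look up the shared stream for a file, recording the access time; null if absent. */
    V get(K key, long nowNanos) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        e.lastAccessNanos = nowNanos;   // each read request refreshes the idle clock
        return e.stream;
    }

    /** Register a freshly opened stream so later read requests can share it. */
    void put(K key, V stream, long nowNanos) {
        entries.put(key, new Entry<>(stream, nowNanos));
    }

    /** Close and drop every stream that has been idle longer than the timeout. */
    void evictIdle(long nowNanos) {
        for (Iterator<Entry<V>> it = entries.values().iterator(); it.hasNext(); ) {
            Entry<V> e = it.next();
            if (nowNanos - e.lastAccessNanos > idleTimeoutNanos) {
                it.remove();
                try { e.stream.close(); } catch (IOException ignored) { /* best-effort close */ }
            }
        }
    }

    int size() { return entries.size(); }
}
```

In a real server the evictIdle sweep would run on a background timer (or, as in the actual patch, be delegated to Guava's expire-after-access bookkeeping) rather than being called by hand.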
[jira] [Updated] (HDFS-5255) Distcp job fails with hsftp when https is enabled
[ https://issues.apache.org/jira/browse/HDFS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-5255:
-
Attachment: HDFS-5255.05.patch

Distcp job fails with hsftp when https is enabled
-
Key: HDFS-5255
URL: https://issues.apache.org/jira/browse/HDFS-5255
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Yesha Vora
Assignee: Arpit Agarwal
Attachments: HDFS-5255.01.patch, HDFS-5255.02.patch, HDFS-5255.04.patch, HDFS-5255.05.patch

Run a Distcp job using hsftp when ssl is enabled. The job fails with a "java.net.SocketException: Unexpected end of file from server" error.

Running: hadoop distcp hsftp://localhost:50070/f1 hdfs://localhost:19000/f5

All the tasks fail with the error below:

13/09/23 15:52:38 INFO mapreduce.Job: Task Id : attempt_1379976241507_0004_m_00_0, Status : FAILED
Error: java.io.IOException: File copy failed: hsftp://localhost:50070/f1 --> hdfs://localhost:19000/f5
 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
 at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
 at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hsftp://127.0.0.1:50070/f1 to hdfs://localhost:19000/f5
 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
 ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: HTTP_OK expected, received 500
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:233)
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:198)
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:134)
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:101)
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83)
 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
 ... 11 more
Caused by: java.io.IOException: HTTP_OK expected, received 500
 at org.apache.hadoop.hdfs.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:383)
 at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:119)
 at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
 at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
 at java.io.DataInputStream.read(DataInputStream.java:149)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
 at java.io.FilterInputStream.read(FilterInputStream.java:107)
 at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:75)
 at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:230)
 ... 16 more
--
This message was sent by Atlassian JIRA (v6.1#6144)
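The "Couldn't run retriable-command" frame in the trace above comes from DistCp's RetriableCommand wrapper, which re-runs the copy before letting the task fail. A minimal sketch of that wrap-and-retry shape, assuming a fixed attempt count in place of the real pluggable RetryPolicy; the class and its members here are illustrative, not the actual Hadoop code:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Toy stand-in for the retry wrapper seen in the stack trace. */
class RetriableCommand<T> {
    private final int maxAttempts;
    private final Callable<T> action;

    RetriableCommand(int maxAttempts, Callable<T> action) {
        this.maxAttempts = maxAttempts;
        this.action = action;
    }

    /** Run the action, retrying on failure; rethrow the last error once attempts run out. */
    T execute() throws IOException {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;  // retry transient failures such as the HTTP 500 above
            }
        }
        // All attempts failed: wrap the last cause, mirroring the message in the trace.
        throw new IOException("Couldn't run retriable-command", last);
    }
}
```

Note that retrying only helps when the failure is transient; in this bug the server returns 500 on every range request, so every retry fails the same way and the wrapped exception is what surfaces in the task log.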
[jira] [Updated] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
[ https://issues.apache.org/jira/browse/HDFS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-5171:
-
Attachment: HDFS-5171.001.patch

NFS should create input stream for a file and try to share it with multiple read requests
-
Key: HDFS-5171
URL: https://issues.apache.org/jira/browse/HDFS-5171
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Brandon Li
Assignee: Haohui Mai
Attachments: HDFS-5171.000.patch, HDFS-5171.001.patch

Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As it does for write requests, NFS should create an input stream for a file and try to share it among multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec).
--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5171) NFS should create input stream for a file and try to share it with multiple read requests
[ https://issues.apache.org/jira/browse/HDFS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782646#comment-13782646 ]

Hadoop QA commented on HDFS-5171:
-

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12606041/HDFS-5171.001.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-nfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5071//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5071//console

This message is automatically generated.

NFS should create input stream for a file and try to share it with multiple read requests
-
Key: HDFS-5171
URL: https://issues.apache.org/jira/browse/HDFS-5171
Project: Hadoop HDFS
Issue Type: Sub-task
Components: nfs
Reporter: Brandon Li
Assignee: Haohui Mai
Attachments: HDFS-5171.000.patch, HDFS-5171.001.patch

Currently, NFS creates an input stream for each read request and closes it after the request is served. With lots of read requests, the overhead is significant. As it does for write requests, NFS should create an input stream for a file and try to share it among multiple read requests. The stream can be closed if there is no read request for a certain amount of time (e.g., 10 sec).
--
This message was sent by Atlassian JIRA (v6.1#6144)