[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110360#comment-14110360 ] Hadoop QA commented on HDFS-6773: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664301/HDFS-6773.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7763//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7763//console This message is automatically generated. MiniDFSCluster should skip edit log fsync by default Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
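For illustration, a minimal sketch (using the standard MiniDFSCluster builder API, not the committed patch) of how a single test can apply the fsync skip quoted above before bringing up the mini cluster:
{code}
// Sketch only: skip edit-log fsync for a test that does not exercise
// edit-log durability, then start a MiniDFSCluster as usual.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream;

public class FastMiniClusterSketch {
  public static void main(String[] args) throws Exception {
    // The call quoted in the issue: edit-log syncs skip the fsync.
    EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);

    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .build();
    try {
      cluster.waitActive();
      // ... test logic against cluster.getFileSystem() goes here ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}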
[jira] [Updated] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6898: Attachment: HDFS-6898.04.patch Thanks for the review. Addressed all your feedback and added a stress test. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6898: Attachment: HDFS-6898.05.patch Fix a typo. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Summary: Optimize HDFS Encrypted Transport performance (was: Optimize encryption support in DataTransfer Protocol with High performance) Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism, it supports three security strength: * high 3des or rc4 (126bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6921) Add LazyPersist flag to FileStatus
[ https://issues.apache.org/jira/browse/HDFS-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110457#comment-14110457 ] Vinayakumar B commented on HDFS-6921: - I feel that since the current patch does not modify the write(..) and readFields(..) methods of the Writable interface, FileStatus.java is still compatible. I agree that, as a result, FileStatus will not carry isLazyPersist over the wire. But in HDFS this is carried through HdfsFileStatus's proto message, which is backward compatible by default. So I feel this may not be a problem for existing clients, and hence DistCp would also work fine. Am I missing anything? Add LazyPersist flag to FileStatus -- Key: HDFS-6921 URL: https://issues.apache.org/jira/browse/HDFS-6921 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6921.01.patch, HDFS-6921.02.patch A new flag will be added to FileStatus to indicate that a file can be lazily persisted to disk, i.e., trading reduced durability for better write performance. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6922) Add LazyPersist flag to INodeFile, save it in FsImage and edit logs
[ https://issues.apache.org/jira/browse/HDFS-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110469#comment-14110469 ] Vinayakumar B commented on HDFS-6922: - 1. The layout version should be changed in NameNodeLayoutVersion.java:
{code}
+    LAZY_PERSIST_FILES(-55, -52, "Support for optional lazy persistence of "
+        + "files with reduced durability guarantees",
+        true, PROTOBUF_FORMAT, EXTENDED_ACL);
{code}
2. Better to use Java naming conventions in BlockCollection.java:
{code}
+  public boolean getLazyPersistFlag();
{code}
Add LazyPersist flag to INodeFile, save it in FsImage and edit logs --- Key: HDFS-6922 URL: https://issues.apache.org/jira/browse/HDFS-6922 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6922.01.patch Support for saving the LazyPersist flag in the FsImage and edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110474#comment-14110474 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664319/HDFS-6826v7.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7764//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7764//console This message is automatically generated. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
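As a rough illustration of the kind of hook described above, here is a hypothetical shape of an authorization-delegation interface; the actual API is the one defined in the attached patches and design documents, not this sketch:
{code}
// Hypothetical sketch only: a NameNode-side hook that lets an external
// system answer permission checks for HDFS paths. Names are illustrative.
public interface AuthorizationDelegationSketch {
  /** Return true if the given user may perform the action on the path. */
  boolean isPermitted(String user, String path, String action);
}
{code}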
[jira] [Created] (HDFS-6945) ExcessBlocks metric may not be decremented if there are no over replicated blocks
Akira AJISAKA created HDFS-6945: --- Summary: ExcessBlocks metric may not be decremented if there are no over replicated blocks Key: HDFS-6945 URL: https://issues.apache.org/jira/browse/HDFS-6945 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Akira AJISAKA I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters; however, there are no over-replicated blocks (confirmed by fsck). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6832) Fix the usage of 'hdfs namenode' command
[ https://issues.apache.org/jira/browse/HDFS-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6832: Target Version/s: 2.6.0 (was: 2.5.0) Fix the usage of 'hdfs namenode' command Key: HDFS-6832 URL: https://issues.apache.org/jira/browse/HDFS-6832 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: skrho Priority: Minor Labels: newbie Attachments: hdfs-6832.txt, hdfs-6832_001.txt {code} [root@trunk ~]# hdfs namenode -help Usage: java NameNode [-backup] | [-checkpoint] | [-format [-clusterid cid ] [-force] [-nonInteractive] ] | [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] | [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] | [-rollback] | [-rollingUpgrade downgrade|rollback ] | [-finalize] | [-importCheckpoint] | [-initializeSharedEdits] | [-bootstrapStandby] | [-recover [ -force] ] | [-metadataVersion ] ] {code} There're some issues in the usage to be fixed. # Usage: java NameNode should be Usage: hdfs namenode # -rollingUpgrade started option should be added # The last ']' should be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110554#comment-14110554 ] Hadoop QA commented on HDFS-6898: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664334/HDFS-6898.04.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7765//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7765//console This message is automatically generated. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110563#comment-14110563 ] Hadoop QA commented on HDFS-6898: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664335/HDFS-6898.05.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestPersistBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7766//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7766//console This message is automatically generated. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110632#comment-14110632 ] Vinayakumar B commented on HDFS-6898: - Do you think this reservation should be done for the tmp files also? DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
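As an aside, a minimal, hypothetical sketch of the up-front per-volume reservation idea discussed in this issue; the class and method names below are illustrative and do not reflect the actual FsDatasetImpl/FsVolumeImpl code in the patches:
{code}
// Hypothetical sketch: reserve a full block's worth of space when an RBW
// replica is created, and release the unwritten remainder when it is
// finalized, so writers fail fast instead of hitting DiskOutOfSpace mid-write.
import java.util.concurrent.atomic.AtomicLong;

public class VolumeSpaceReservationSketch {
  private final long capacityBytes;                  // total bytes on the volume
  private final AtomicLong reservedBytes = new AtomicLong(0);

  public VolumeSpaceReservationSketch(long capacityBytes) {
    this.capacityBytes = capacityBytes;
  }

  /** Reserve a full block up front; throw if the volume cannot hold it. */
  public void reserveForRbw(long blockSize, long usedBytes) {
    long newReserved = reservedBytes.addAndGet(blockSize);
    if (usedBytes + newReserved > capacityBytes) {
      reservedBytes.addAndGet(-blockSize);           // roll back the reservation
      throw new IllegalStateException("Insufficient space to reserve a full block");
    }
  }

  /** Release the portion of the reservation that was never written. */
  public void releaseReservation(long unwrittenBytes) {
    reservedBytes.addAndGet(-unwrittenBytes);
  }
}
{code}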
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110645#comment-14110645 ] Zesheng Wu commented on HDFS-6827: -- [~vinayrpet], I verified your patch of HADOOP-10251 on my cluster, it works as expected. Thanks. I will resolve this issue as 'duplicated'. Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes -- Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Critical Attachments: HDFS-6827.1.patch In our production cluster, we encounter a scenario like this: ANN crashed due to write journal timeout, and was restarted by the watchdog automatically, but after restarting both of the NNs are standby. Following is the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected connection reset by peer {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 wat restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 was healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the failover, as a result, both NNs were standby. The root cause of this bug is that NN is restarted too quickly and ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Resolution: Duplicate Status: Resolved (was: Patch Available) Duplicate of HADOOP-10251. Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes -- Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Critical Attachments: HDFS-6827.1.patch In our production cluster, we encounter a scenario like this: ANN crashed due to write journal timeout, and was restarted by the watchdog automatically, but after restarting both of the NNs are standby. Following is the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected connection reset by peer {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 wat restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 was healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the failover, as a result, both NNs were standby. The root cause of this bug is that NN is restarted too quickly and ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Description: In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. was: In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism, it supports three security strength: * high 3des or rc4 (126bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: OptimizeHdfsEncryptedTransportperformance.pdf Attach a brief design for this optimization. Our goals are: * Support using CryptoCodec for encryption of HDFS transport. By default client and server will negotiate to use AES-CTR. * Compatibility: for old client or old server, it still works. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 Attachments: OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
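For background on the proposed AES-CTR transport, a small plain-JCE sketch (not the Hadoop CryptoCodec API from HADOOP-10150/10603/10693) showing how an output stream can be wrapped with AES/CTR; key and IV negotiation is omitted:
{code}
// Illustration only: wrap a raw stream with AES/CTR using the standard JCE
// API. With AES-NI capable hardware this mode is far faster than 3DES/RC4.
import java.io.OutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrStreamSketch {
  public static OutputStream wrap(OutputStream raw, byte[] key, byte[] iv)
      throws Exception {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE,
        new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    // Every byte written to the returned stream is encrypted with AES-CTR.
    return new CipherOutputStream(raw, cipher);
  }
}
{code}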
[jira] [Moved] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu moved HBASE-11824 to HDFS-6946: -- Key: HDFS-6946 (was: HBASE-11824) Project: Hadoop HDFS (was: HBase) TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: HDFS-6776.009.patch distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: (was: HDFS-6776.009.patch) distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: HDFS-6776.009.patch distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Summary: distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs (was: distcp from insecure cluster (source) to secure cluster (destination) doesn't work) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110758#comment-14110758 ] Yongjun Zhang commented on HDFS-6776: - Uploaded patch 009. This version passes a real null delegation token for webhdfs when an insecure cluster is asked for a delegation token, which hopefully addresses the earlier concern. In addition, I included a config property which has to be turned on to support the fallback. Hi [~daryn] and [~wheat9], thanks a lot for your earlier comments; hopefully this addresses them. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
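Regarding the fallback switch mentioned in the comment above, a hedged sketch of how such a client-side property is typically read; the key shown is an existing Hadoop setting used purely as an example and may not be the property introduced by this patch:
{code}
// Illustration only: a client refuses to fall back to insecure auth unless
// the operator has explicitly enabled the fallback switch.
import org.apache.hadoop.conf.Configuration;

public class FallbackSwitchSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    boolean allowFallback =
        conf.getBoolean("ipc.client.fallback-to-simple-auth-allowed", false);
    if (!allowFallback) {
      System.err.println("Refusing to fall back to insecure authentication");
    }
  }
}
{code}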
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: HDFS-6606.001.patch Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Fix Version/s: (was: 3.0.0) Target Version/s: 2.6.0 (was: 3.0.0) Affects Version/s: (was: 3.0.0) Status: Patch Available (was: In Progress) Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6938) Cleanup javac warnings in FSNamesystem.java
[ https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110798#comment-14110798 ] Charles Lamb commented on HDFS-6938: Since the diffs are only fixing unused imports and fields, no unit tests are necessary. Cleanup javac warnings in FSNamesystem.java --- Key: HDFS-6938 URL: https://issues.apache.org/jira/browse/HDFS-6938 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6938.001.patch Clean up some unused code/compiler warnings post fs-encryption merge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110805#comment-14110805 ] Yongjun Zhang commented on HDFS-6776: - BTW, I'd like to restrict the solution of the jira for webhdfs only, and I modified the title of this jira to reflect that. At least with the fix, we can enable distcping between secure and insecure cluster. As we know, right now it's broken. For other interface, like hftp in branch-2. I will file follow-up jira to resolve them. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
[jira] [Commented] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110927#comment-14110927 ] Juan Yu commented on HDFS-6908: --- [~jingzhao]] Thanks for reviewing patch and the discussion. incorrect snapshot directory diff generated by snapshot deletion Key: HDFS-6908 URL: https://issues.apache.org/jira/browse/HDFS-6908 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Juan Yu Assignee: Juan Yu Priority: Critical Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch, HDFS-6908.003.patch In the following scenario, delete snapshot could generate incorrect snapshot directory diff and corrupted fsimage, if you restart NN after that, you will get NullPointerException. 1. create a directory and create a file under it 2. take a snapshot 3. create another file under that directory 4. take second snapshot 5. delete both files and the directory 6. delete second snapshot incorrect directory diff will be generated. Restart NN will throw NPE {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168) at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
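The six reproduction steps above map directly onto the public snapshot APIs, so the scenario can be scripted against a mini cluster. The following is a rough sketch under the usual test assumptions (MiniDFSCluster, DFSTestUtil, illustrative paths); it also saves the namespace and restarts the NameNode so the corrupted fsimage is actually reloaded and the NPE surfaces:
{code}
// Assumes: org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.Path,
// org.apache.hadoop.hdfs.{MiniDFSCluster, DFSTestUtil, DistributedFileSystem},
// org.apache.hadoop.hdfs.protocol.HdfsConstants.
Configuration conf = new Configuration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
try {
  cluster.waitActive();
  DistributedFileSystem dfs = cluster.getFileSystem();

  Path root = new Path("/test");
  Path sub = new Path(root, "sub");                                      // step 1
  dfs.mkdirs(sub);
  DFSTestUtil.createFile(dfs, new Path(sub, "f1"), 1024, (short) 1, 0L);

  dfs.allowSnapshot(root);
  dfs.createSnapshot(root, "s1");                                        // step 2
  DFSTestUtil.createFile(dfs, new Path(sub, "f2"), 1024, (short) 1, 0L); // step 3
  dfs.createSnapshot(root, "s2");                                        // step 4

  dfs.delete(sub, true);                                                 // step 5
  dfs.deleteSnapshot(root, "s2");                                        // step 6

  // Persist the (bad) directory diff into an fsimage and reload it.
  dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
  dfs.saveNamespace();
  dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);
  cluster.restartNameNode();
} finally {
  cluster.shutdown();
}
{code}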
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110933#comment-14110933 ] Ray Chiang commented on HDFS-6942: -- Both unit test failures are unrelated and both tests work in my tree. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.2#6252)

[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-6826: - Attachment: HDFS-6826v7.4.patch The failing tests pass locally, and scanning the test output does not show anything related to this patch. Uploading a new v7 patch with some refactoring, making the authz provider an abstract class with singleton-pattern access. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
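For readers following the refactoring mentioned in the comment above, "an abstract class with singleton-pattern access" generally has the shape below. The class and method names here are placeholders and are not taken from HDFS-6826v7.4.patch:
{code}
// Hypothetical sketch of the pattern only; not the actual classes in the patch.
public abstract class AuthorizationProvider {
  private static AuthorizationProvider instance;

  /** NameNode code asks the singleton for authorization decisions. */
  public static synchronized AuthorizationProvider get() {
    return instance;
  }

  /** Set at NameNode startup, either to the default or to an external plugin. */
  public static synchronized void set(AuthorizationProvider provider) {
    instance = provider;
  }

  /** Subclasses implement the actual permission check for a path and user. */
  public abstract void checkPermission(String path, String user)
      throws org.apache.hadoop.security.AccessControlException;
}
{code}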
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110950#comment-14110950 ] Hadoop QA commented on HDFS-6776: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664394/HDFS-6776.009.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7767//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7767//console This message is automatically generated. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to 
get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110949#comment-14110949 ] Hadoop QA commented on HDFS-6776: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664395/HDFS-6776.009.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7768//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7768//console This message is automatically generated. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, 
user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at
[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110953#comment-14110953 ] Stephen Chu commented on HDFS-6773: --- The above two test failures aren't related to this patch. I ran them locally successfully to double-check. MiniDFSCluster should skip edit log fsync by default Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
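Until the mini cluster skips the fsync by default, an individual test can opt in itself. A minimal sketch of that setup, assuming the usual JUnit plus MiniDFSCluster scaffolding:
{code}
private Configuration conf;
private MiniDFSCluster cluster;

@Before
public void setUp() throws Exception {
  // Skip the per-transaction fsync of the edit log; only safe for tests that
  // do not assert durability of edits across a crash.
  EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);
  conf = new HdfsConfiguration();
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
  cluster.waitActive();
}

@After
public void tearDown() throws Exception {
  if (cluster != null) {
    cluster.shutdown();
  }
}
{code}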
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110970#comment-14110970 ] Yongjun Zhang commented on HDFS-6694: - Hi Arpit, thanks for your earlier review, would you please help committing it? Thanks. TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Blocker Fix For: 2.6.0 Attachments: HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.002.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111006#comment-14111006 ] Arpit Agarwal commented on HDFS-6694: - The repo is not open for commits. TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Blocker Fix For: 2.6.0 Attachments: HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.002.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111022#comment-14111022 ] Colin Patrick McCabe commented on HDFS-6912: bq. Colin Patrick McCabe: this is a machine without any swap. I took another look at the code, and it looks like we're creating a sparse file by using {{ftruncate}}. That, in turn, leads to the SIGBUS later when we try to access the offset in the file, and no memory is available to de-sparsify it. To remedy this, I added a call to {{posix_fallocate}}. This will lead to the space in memory being allocated at the time we create the shared file descriptor, rather than later when we read from it. Because you are out of memory, you'll still get a failure... but the failure will happen during allocation, not later, and it will be an exception which is handled cleanly, not a SIGBUS which shuts down the JVM. See if this patch works for you. bq. The commit seems to be one of yours, can you explain why this suggests /dev/shm? The configuration default is in {{/dev/shm}} because that is present on every modern Linux installation. We always want the shared memory segment FD to be in memory, rather than on disk. We have to read from this thing prior to every short-circuit read, so it needs to be fast. ramfs would have been better, but this would require special setup which most users don't want to do right now. Maybe this will change if we start recommending ramfs for HDFS-5851. Anyway, ramfs and tmpfs will behave similarly when swap is off, as in your case. HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. 
{code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j
[jira] [Updated] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Assignee: Colin Patrick McCabe Status: Patch Available (was: Open) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.001.patch HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Summary: SharedFileDescriptorFactory should not allocate sparse files (was: HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage) SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6902) FileWriter should be closed in finally block in BlockReceiver#receiveBlock()
[ https://issues.apache.org/jira/browse/HDFS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111035#comment-14111035 ] Colin Patrick McCabe commented on HDFS-6902: +1. thanks FileWriter should be closed in finally block in BlockReceiver#receiveBlock() Key: HDFS-6902 URL: https://issues.apache.org/jira/browse/HDFS-6902 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Tsuyoshi OZAWA Priority: Minor Attachments: HDFS-6902.1.patch, HDFS-6902.2.patch Here is code starting from line 828: {code} try { FileWriter out = new FileWriter(restartMeta); // write out the current time. out.write(Long.toString(Time.now() + restartBudget)); out.flush(); out.close(); } catch (IOException ioe) { {code} If write() or flush() call throws IOException, out wouldn't be closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
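For reference, one way to restructure the quoted snippet so the stream is always closed is shown below. The variables restartMeta, restartBudget and LOG come from the surrounding BlockReceiver context; this is a sketch of the idea, not necessarily the exact code in the attached patches:
{code}
FileWriter out = null;
try {
  out = new FileWriter(restartMeta);
  // write out the current time.
  out.write(Long.toString(Time.now() + restartBudget));
  out.flush();
} catch (IOException ioe) {
  LOG.warn("Failed to write restart meta file " + restartMeta, ioe);
} finally {
  // org.apache.hadoop.io.IOUtils: closes the stream and swallows close() errors.
  IOUtils.cleanup(LOG, out);
}
{code}
On Java 7 and later, a try-with-resources block achieves the same guarantee with less code.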
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: HDFS-6851.000.patch Posting .000 patch for a testpatch run. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Description: SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. was: The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. Priority: Minor (was: Major) SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Target Version/s: 3.0.0 (was: fs-encryption (HADOOP-10150 and HDFS-6134)) Affects Version/s: (was: fs-encryption (HADOOP-10150 and HDFS-6134)) 3.0.0 Status: Patch Available (was: Open) Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111039#comment-14111039 ] Hadoop QA commented on HDFS-6606: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664402/HDFS-6606.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7769//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7769//console This message is automatically generated. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111058#comment-14111058 ] Hadoop QA commented on HDFS-6892: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664273/HDFS-6892.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7771//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7771//console This message is automatically generated. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6947) Enhance HAR integration with encryption zones
Andrew Wang created HDFS-6947: - Summary: Enhance HAR integration with encryption zones Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111064#comment-14111064 ] Hadoop QA commented on HDFS-6851: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664435/HDFS-6851.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7773//console This message is automatically generated. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6922) Add LazyPersist flag to INodeFile, save it in FsImage and edit logs
[ https://issues.apache.org/jira/browse/HDFS-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6922: Attachment: HDFS-6922.02.patch Thanks for reviewing Vinayakumar. Good catch on #1. I updated the patch. Not sure what you mean by the second comment. Which Java naming convention? Add LazyPersist flag to INodeFile, save it in FsImage and edit logs --- Key: HDFS-6922 URL: https://issues.apache.org/jira/browse/HDFS-6922 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6922.01.patch, HDFS-6922.02.patch Support for saving the LazyPersist flag in the FsImage and edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111072#comment-14111072 ] Arpit Agarwal commented on HDFS-6898: - Yes it may be helpful to have reservation for tmp files also. I'll file a separate Jira to look into it. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
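The reservation itself is simple accounting: set aside a full block's worth of space when the RBW replica is created and release it when the replica is finalized or aborted. A stripped-down illustration of that accounting follows; the real patch folds this into the DataNode's volume classes, so the class and field names here are only illustrative:
{code}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only; the actual change lives inside the DataNode volume code.
class VolumeSpaceAccounting {
  private final long capacity;
  private final AtomicLong used = new AtomicLong(0);
  private final AtomicLong reservedForRbw = new AtomicLong(0);

  VolumeSpaceAccounting(long capacity) {
    this.capacity = capacity;
  }

  /** Reserve a full block of space when an RBW replica is created. */
  boolean tryReserve(long blockSize) {
    while (true) {
      long reserved = reservedForRbw.get();
      if (used.get() + reserved + blockSize > capacity) {
        return false;                 // creating the replica would overcommit
      }
      if (reservedForRbw.compareAndSet(reserved, reserved + blockSize)) {
        return true;
      }
    }
  }

  /** Release the reservation when the replica is finalized or aborted. */
  void release(long blockSize) {
    reservedForRbw.addAndGet(-blockSize);
  }
}
{code}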
[jira] [Commented] (HDFS-6865) Byte array native checksumming on client side (HDFS changes)
[ https://issues.apache.org/jira/browse/HDFS-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111085#comment-14111085 ] Todd Lipcon commented on HDFS-6865: --- Thanks for doing the diligence on the performance tests. Looks like this will be a good speedup across the board. A few comments: - In the FSOutputSummer constructor, aren't checksumSize and maxChunkSize now redundant with the DataChecksum object that's passed in? {{checksumSize}} should be the same as {{sum.getChecksumSize()}} and {{maxChunkSize}} should be the same as {{sum.getBytesPerChecksum()}}, no? - Similarly, in the FSOutputSummer class, it seems like the member variables of the same names are redundant with the {{sum}} member variable. - Can you mark {{sum}} as {{final}} in FSOutputSummer? - Shouldn't BUFFER_NUM_CHUNKS be a multiple of 3, since we calculate three chunks worth in parallel in the native code? (worth a comment explaining the choice, too) {code} private int write1(byte b[], int off, int len) throws IOException { if(count==0 && len>=buf.length) { // local buffer is empty and user data has one chunk // checksum and output data {code} This comment is no longer accurate, right? The condition is now that the user data has provided data at least as long as our internal buffer. - {{writeChecksumChunk}} should probably be renamed to {{writeChecksumChunks}} and its javadoc updated. - It's a little weird that you loop over {{writeChunk}} and pass a single chunk per call, though you actually have data ready for multiple chunks, and the API itself seems to be perfectly suitable to pass all of the chunks at once. Did you want to leave this as a later potential optimization? {code} writeChunk(b, off + i, Math.min(maxChunkSize, len - i), checksum, i / maxChunkSize * checksumSize, checksumSize); {code} This code might be a little easier to read if you made some local variables: {code} int rem = Math.min(maxChunkSize, len - i); int ckOffset = i / maxChunkSize * checksumSize; writeChunk(b, off + i, rem, checksum, ckOffset, checksumSize); {code} {code} /* Forces any buffered output bytes to be checksumed and written out to * the underlying output stream. If keep is true, then the state of * this object remains intact. {code} This comment is now inaccurate. If {{keep}} is true, then it retains only the last partial chunk worth of buffered data. - The {{setNumChunksToBuffer}} static thing is kind of sketchy. What if, instead, you implemented flush() in FSOutputSummer such that it always flushed all completed chunks? (and not any partial last chunk). Then you could make those tests call flush() before checkFile(), and not have to break any abstractions? Byte array native checksumming on client side (HDFS changes) Key: HDFS-6865 URL: https://issues.apache.org/jira/browse/HDFS-6865 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, performance Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6865.2.patch, HDFS-6865.3.patch, HDFS-6865.4.patch, HDFS-6865.5.patch, HDFS-6865.patch Refactor FSOutputSummer to buffer data and use the native checksum calculation functionality introduced in HADOOP-10975. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6923) Propagate LazyPersist flag to DNs via DataTransferProtocol
[ https://issues.apache.org/jira/browse/HDFS-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6923: Attachment: HDFS-6923.02.patch Rebased patch. Propagate LazyPersist flag to DNs via DataTransferProtocol -- Key: HDFS-6923 URL: https://issues.apache.org/jira/browse/HDFS-6923 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6923.01.patch, HDFS-6923.02.patch If the LazyPersist flag is set in the file properties, the DFSClient will propagate it to the DataNode via DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1402#comment-1402 ] Haohui Mai commented on HDFS-6892: -- Looks good to me. I think there are multiple places that the code can be simplified by merging the declarations and definitions: {code} +FileHandle handle = null; +handle = readHandle(xdr); {code} to {code} FileHandle handle = readHandle(xdr); {code} And {code} +FileHandle handle = null; +long cookie; +long cookieVerf; +int count; +handle = readHandle(xdr); cookie = xdr.readHyper(); cookieVerf = xdr.readHyper(); count = xdr.readInt(); {code} to {code} FileHandle handle = readHandle(xdr); long cookie = xdr.readHyper(); long cookieVerf = xdr.readHyper(); int count = xdr.readInt(); {code} +1 once addressed. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch This method can be used for unit tests. Most requests implement this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request, READDIR3Request, READDIRPLUS3Request, RMDIR3Request, REMOVE3Request, SETATTR3Request, SYMLINK3Request. RENAME3Request is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6925) DataNode should attempt to place replicas on transient storage first if lazyPersist flag is received
[ https://issues.apache.org/jira/browse/HDFS-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6925: Attachment: HDFS-6925.02.patch Thanks for reviewing [~jnp]! Updated patch to remove unnecessary edit to VolumeChoosingPolicy. The while loop in createRbw is to allow fallback to disk. We'll execute it at the most twice. I think it simplifies the failure handling. DataNode should attempt to place replicas on transient storage first if lazyPersist flag is received Key: HDFS-6925 URL: https://issues.apache.org/jira/browse/HDFS-6925 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Environment: If the LazyPersist flag is received via DataTransferProtocol then DN should attempt to place the files on RAM disk first, and failing that on regular disk. Support for lazily moving replicas from RAM disk to persistent storage will be added later. Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6925.01.patch, HDFS-6925.02.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
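The at-most-twice loop described above is essentially "try RAM disk, fall back to disk once". A simplified sketch with hypothetical names (not the actual FsDatasetImpl/createRbw code):
{code}
import java.io.IOException;

class RbwPlacementSketch {
  enum StorageType { RAM_DISK, DISK }

  interface VolumeChooser {
    // Throws IOException if no volume of the requested type has enough space.
    String chooseVolume(StorageType type, long blockSize) throws IOException;
  }

  static String placeReplica(VolumeChooser chooser, long blockSize,
                             boolean lazyPersist) throws IOException {
    StorageType type = lazyPersist ? StorageType.RAM_DISK : StorageType.DISK;
    while (true) {                    // executes at most twice
      try {
        return chooser.chooseVolume(type, blockSize);
      } catch (IOException e) {
        if (type == StorageType.RAM_DISK) {
          type = StorageType.DISK;    // fall back to persistent storage and retry once
        } else {
          throw e;                    // already on disk: give up
        }
      }
    }
  }
}
{code}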
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1407#comment-1407 ] Todd Lipcon commented on HDFS-6912: --- Hey Colin. Did you verify that tmpfs supports fallocate going back to old versions? Looking at the kernel git history, it was only added in mid 2012 (e2d12e22c59ce714008aa5266d769f8568d74eac) corresponding to version 3.5. So, I'm not sure if it would be supported on el6 for example (maybe they backported it, maybe not). Doing a normal posix write() call to write some explicit zeros to the fd might be more portable and shouldn't really have any performance downside. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1418#comment-1418 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664433/HDFS-6912.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7772//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7772//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6948) DN rejects blocks if it has older UC block
Daryn Sharp created HDFS-6948: - Summary: DN rejects blocks if it has older UC block Key: HDFS-6948 URL: https://issues.apache.org/jira/browse/HDFS-6948 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp DNs appear to always reject blocks, even with newer genstamps, if they already have a UC copy in their tmp dir. {noformat}ReplicaAlreadyExistsException: Block XXX already exists in state TEMPORARY and thus cannot be created{noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1495#comment-1495 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664424/HDFS-6826v7.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7770//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7770//console This message is automatically generated. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6928) 'hdfs put' command should accept lazyPersist flag for testing
[ https://issues.apache.org/jira/browse/HDFS-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6928: Attachment: HDFS-6928.02.patch Rebased patch. 'hdfs put' command should accept lazyPersist flag for testing - Key: HDFS-6928 URL: https://issues.apache.org/jira/browse/HDFS-6928 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6928.01.patch, HDFS-6928.02.patch Add a '-l' flag to 'hdfs put' which creates the file with the LAZY_PERSIST option. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111231#comment-14111231 ] Brandon Li commented on HDFS-6892: -- Uploaded a new patch to address Haohui's comments. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6892: - Attachment: HDFS-6892.004.patch Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6929) NN periodically unlinks lazy persist files with missing replicas from namespace
[ https://issues.apache.org/jira/browse/HDFS-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6929: Attachment: HDFS-6929.02.patch Updated patch to allow turning off the scrubber, document the option. NN periodically unlinks lazy persist files with missing replicas from namespace --- Key: HDFS-6929 URL: https://issues.apache.org/jira/browse/HDFS-6929 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: HDFS-6581 Attachments: HDFS-6929.01.patch, HDFS-6929.02.patch Occasional data loss is expected when using the lazy persist flag due to node restarts. The NN will optionally unlink lazy persist files from the namespace to avoid them from showing up as corrupt files. This behavior can be turned off with a global option. In the future this may be made a per-file option controllable by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: (was: HDFS-6851.000.patch) Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: HDFS-6851.000.patch Redo the .000 patch. The last one didn't include the two deleted files. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111284#comment-14111284 ] Colin Patrick McCabe commented on HDFS-6912: bq. Hey Colin. Did you verify that tmpfs supports fallocate going back to old versions? Looking at the kernel git history, it was only added in mid 2012 (e2d12e22c59ce714008aa5266d769f8568d74eac) corresponding to version 3.5. So, I'm not sure if it would be supported on el6 for example (maybe they backported it, maybe not). I believe the glibc {{posix_fallocate}} wrapper falls back to using {{write()}} calls when {{fallocate}} itself is not supported by the kernel. There is some discussion here: https://lists.gnu.org/archive/html/bug-coreutils/2009-05/msg00207.html which talks about: bq. i.e. fall back to using write() as the glibc posix_fallocate() implementation does. But, I think it's simpler to just use {{write}} here. Any performance advantage to using {{ftruncate}} + {{fallocate}} is going to be extremely tiny (or nonexistent) since this file is only 8192 bytes. And {{write}} is much more portable. So here is a new version that does that. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
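The write()-based approach is easy to picture: instead of only truncating the file to its final length (which leaves a sparse file), write explicit zeros for the whole length so the backing pages are actually allocated. The patch itself touches native code, so the Java snippet below is only a conceptual stand-in for the same idea, not the actual SharedFileDescriptorFactory change.
{code}
import java.io.IOException;
import java.io.RandomAccessFile;

class NonSparseAllocSketch {
  // Allocate a small file without holes by writing explicit zeros.
  static void allocateZeroFilled(String path, int length) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
      byte[] zeros = new byte[length]; // e.g. 8192 bytes for the shared segment
      raf.write(zeros);                // pages are really written, not left sparse
    }
  }
}
{code}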
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111299#comment-14111299 ] Hadoop QA commented on HDFS-6892: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664461/HDFS-6892.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7774//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7774//console This message is automatically generated. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.002.patch SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6911: -- Attachment: h6911_20140827.patch h6911_20140827.patch: adds a new test for ScheduleSameBlock. Also adds another new test for ChooseExcess. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before scheduling a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
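In outline, the movedBlocks bookkeeping is just a "seen" set consulted before scheduling a move. A simplified sketch with hypothetical names (the real Balancer/Mover structures also handle details such as eviction, which are omitted here):
{code}
import java.util.HashSet;
import java.util.Set;

class MoverSchedulingSketch {
  private final Set<Long> movedBlocks = new HashSet<>(); // block IDs already scheduled

  /** Returns true if the block was newly scheduled, false if it was already scheduled. */
  synchronized boolean scheduleIfNew(long blockId) {
    if (movedBlocks.contains(blockId)) {
      return false;           // skip: a move for this block is already pending
    }
    movedBlocks.add(blockId); // remember it so later passes skip it
    return true;
  }
}
{code}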
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111370#comment-14111370 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664475/HDFS-6912.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.io.nativeio.TestSharedFileDescriptorFactory {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7776//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7776//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.000.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.000.patch Update patch to add command line supports. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Affects Version/s: (was: 2.4.1) 2.5.0 Status: Patch Available (was: Open) Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111455#comment-14111455 ] Stephen Chu commented on HDFS-6946: --- Similar to HDFS-5803, where TestBalancer#TIMEOUT was bumped from 20s to 40s. We can run TestBalancer between current trunk and the time when HDFS-5803 was fixed to see if there is a performance regression while taking into account test code changes. If there isn't a regression, perhaps we should bump up the timeout. TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu reassigned HDFS-6946: - Assignee: Stephen Chu TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Assignee: Stephen Chu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Target Version/s: 3.0.0, 2.6.0 (was: 2.6.0) Affects Version/s: 2.5.0 Status: Patch Available (was: Open) Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1, 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Attachment: HDFS-6727.combo.patch Update a combo patch against trunk. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6894) Add XDR parser method for each NFS response
[ https://issues.apache.org/jira/browse/HDFS-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6894: - Description: This can be an abstract method in NFS3Response to force the subclasses to implement. Add XDR parser method for each NFS response --- Key: HDFS-6894 URL: https://issues.apache.org/jira/browse/HDFS-6894 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li This can be an abstract method in NFS3Response to force the subclasses to implement. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6894) Add XDR parser method for each NFS response
[ https://issues.apache.org/jira/browse/HDFS-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6894: - Environment: (was: This can be an abstract method in NFS3Response to force the subclasses to implement.) Add XDR parser method for each NFS response --- Key: HDFS-6894 URL: https://issues.apache.org/jira/browse/HDFS-6894 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6891) Follow-on work for transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6891: -- Component/s: encryption Follow-on work for transparent data at rest encryption -- Key: HDFS-6891 URL: https://issues.apache.org/jira/browse/HDFS-6891 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb This is an umbrella JIRA to track remaining subtasks from HDFS-6134. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.003.patch The unit test was relying on the file position being 0. I don't think anything else relies on this (we use mmap to access this) but in v3 of the patch, I made it restore the file position to 0 just for simplicity. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, HDFS-6912.003.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6950) Add Additional unit tests for HDFS-6581
Xiaoyu Yao created HDFS-6950: Summary: Add Additional unit tests for HDFS-6581 Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Bug Reporter: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Issue Type: Sub-task (was: Bug) Parent: HDFS-6581 Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Assignee: Xiaoyu Yao Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Bug Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111530#comment-14111530 ] Yongjun Zhang commented on HDFS-6776: - I'd like to emphasize that with the latest patch of using null token instead of NullToken exception, user has to apply the same patch to both source and target cluster. With the prior revision that Alejandro commented, that combines NullToken and message parsing,, user just need to patch the secure cluster. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
[jira] [Commented] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111551#comment-14111551 ] Tsz Wo Nicholas Sze commented on HDFS-6911: --- ... , maybe a more efficient way here is to track the inode id, ... I think it is a good idea. Let's do this improvement separately since it cannot reuse the Balancer code. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before schedule a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6929) NN periodically unlinks lazy persist files with missing replicas from namespace
[ https://issues.apache.org/jira/browse/HDFS-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111564#comment-14111564 ] Jitendra Nath Pandey commented on HDFS-6929: +1 NN periodically unlinks lazy persist files with missing replicas from namespace --- Key: HDFS-6929 URL: https://issues.apache.org/jira/browse/HDFS-6929 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: HDFS-6581 Attachments: HDFS-6929.01.patch, HDFS-6929.02.patch Occasional data loss is expected when using the lazy persist flag due to node restarts. The NN will optionally unlink lazy persist files from the namespace to avoid them from showing up as corrupt files. This behavior can be turned off with a global option. In the future this may be made a per-file option controllable by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111571#comment-14111571 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664509/HDFS-6912.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverControllerStress {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7779//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7779//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, HDFS-6912.003.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111578#comment-14111578 ] Alejandro Abdelnur commented on HDFS-6776: -- IMO, enabling to work with an unpatched cluster (via message parsing) is a desirable capability as it does not require users to upgrade older clusters if they are just reading data from them. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at
[jira] [Commented] (HDFS-6920) Archival Storage: check the storage type of delNodeHintStorage when deleting a replica
[ https://issues.apache.org/jira/browse/HDFS-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111612#comment-14111612 ] Jing Zhao commented on HDFS-6920: - The patch looks good to me. +1 Can we also have a unit test for this? Archival Storage: check the storage type of delNodeHintStorage when deleting a replica -- Key: HDFS-6920 URL: https://issues.apache.org/jira/browse/HDFS-6920 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6920_20140823.patch in BlockManager.chooseExcessReplicates, it does not check the storage type of delNodeHintStorage. Therefore, delNodeHintStorage could possibly be chosen even if its storage type is not an excess storage type. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111634#comment-14111634 ] Colin Patrick McCabe commented on HDFS-6634: The new design doc looks really good. {code} +@InterfaceAudience.Public +@InterfaceStability.Evolving +public class MissingEventsException extends Exception { {code} Since this is part of the public API, we should have a friendly toString method that prints something like inotify was unable to locate some events. We expected txid X, but were only able to read up to txid Y Re: INVALID_TXID. I think that we don't need to add this to the proto file as I suggested earlier. The only way to add it would be as an enum value, which seems like kind of a hack. So it's fine as-is. Re: the QuorumJournalManager changes: [~tlipcon], [~james.thomas], [~andrew.wang] and I talked offline about this. The existing logic in QJM to prevent reading uncommitted edits should suffice, so we shouldn't need to add the ability to fetch the writer epoch via an RPC. There should never be divergent QJM edit logs... as Todd pointed out, each QJM edit log should be up-to-date, or a prefix of an up-to-date log. We should do something to avoid rescanning those in-progress edit logs to find the final txid over and over on the JournalNodes, though. Overall, great work, James... I think this is almost ready to go. inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
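A friendly message along the lines suggested above might look like the following; the class name and txid fields here are placeholders for illustration, not the actual MissingEventsException implementation.
{code}
// Sketch only: surface the expected/actual transaction IDs in the exception message.
public class MissingEventsExceptionSketch extends Exception {
  private final long expectedTxid;
  private final long actualTxid;

  public MissingEventsExceptionSketch(long expectedTxid, long actualTxid) {
    super("inotify was unable to locate some events: expected txid " + expectedTxid
        + ", but was only able to read up to txid " + actualTxid);
    this.expectedTxid = expectedTxid;
    this.actualTxid = actualTxid;
  }

  public long getExpectedTxid() { return expectedTxid; }
  public long getActualTxid() { return actualTxid; }
}
{code}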
[jira] [Commented] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111637#comment-14111637 ] Jing Zhao commented on HDFS-6911: - +1 for the latest patch. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before schedule a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111653#comment-14111653 ] Hadoop QA commented on HDFS-6808: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664492/HDFS-6808.000.combo.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build///testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build///artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build///console This message is automatically generated. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111655#comment-14111655 ] Sanjay Radia commented on HDFS-6469: My thoughts: * I do believe that a Paxos-based NN would give faster failover than what NN HA offers today (30 seconds to a few minutes, but typically no more than a minute or two). So this is clearly a benefit of CNode, though I have not heard a single customer complain about the failover time so far. * The proposed solution does not increase the write throughput. * The parallel-reads advantage of CNode can be achieved in the current HA setup with some work (this is discussed above). If this is the main benefit, then I would rather pursue enhancing the NN standby to support reads. Further, there is ongoing work to improve the locking in the NN. * I share Todd's view that ZK is not a usable reference implementation for Paxos. One really needs a Paxos library that can be plugged in rather than an external server-based solution like ZK. So at this stage I am having a hard time seeing benefits that justify the costs of adding this complexity. I do, however, understand the overhead that Wandisco faces in integrating their solution with HDFS each time HDFS is modified. Would a few plugin interfaces make it easier? I would be more than happy to support adding such plugins if they would help. Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
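On the plugin-interface question, a purely illustrative sketch of the shape such a hook could take follows; none of these names come from CNodeDesign.pdf or from existing HDFS code, and it is only meant to make the discussion concrete.
{code}
import java.io.IOException;

// Purely illustrative: a minimal shape for a pluggable coordination hook.
// None of these names come from CNodeDesign.pdf or from existing HDFS code.
public interface CoordinationEnginePlugin {

  /** Submit a serialized namespace-mutating operation for agreement across replicas. */
  void submitProposal(byte[] serializedOp) throws IOException;

  /** Register a callback invoked, in the agreed global order, once a proposal commits. */
  void registerListener(ProposalListener listener);

  interface ProposalListener {
    void onAgreed(byte[] serializedOp);
  }
}
{code}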
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111707#comment-14111707 ] Yi Liu commented on HDFS-6705: -- If the super user is the owner, is it then unable to access the file? {quote} It is settable by any user which has hdfs access to that file. It can only be set and never removed. {quote} Then any user who has hdfs access can easily prevent the HDFS admin from accessing a file, and the admin can't access that file any more. Could we find a better way? Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6705.001.patch There needs to be an xattr that specifies that the HDFS admin cannot access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
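For context, this is roughly how such a marker xattr would be applied through the public FileSystem xattr API; the xattr name and path below are hypothetical and are not defined by the HDFS-6705 patch.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MarkFileUnreadableByAdmin {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path and xattr name; the actual name is decided by the patch.
    Path file = new Path("/user/alice/secret.dat");
    // A marker xattr carries no value; per the discussion above, once set by a
    // user with access to the file it would never be removable.
    fs.setXAttr(file, "security.hdfs.unreadable.by.superuser", null);
  }
}
{code}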
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6951: -- Attachment: HDFS-6951-testrepo.patch Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
Stephen Chu created HDFS-6951: - Summary: Saving namespace and restarting NameNode will remove existing encryption zones Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6951: -- Description: Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. was: Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb reassigned HDFS-6951: -- Assignee: Charles Lamb Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
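A hedged sketch of the kind of reproduction test described above, written against the existing {{TestEncryptionZones}} fixtures ({{cluster}}, {{fs}}, {{dfsAdmin}}, {{TEST_KEY}}, {{assertNumZones}}); the attached HDFS-6951-testrepo.patch may differ in detail:
{code}
// Hedged sketch, not the attached patch: assumes the TestEncryptionZones
// fixtures (cluster, fs, dfsAdmin, TEST_KEY, assertNumZones) are in scope.
@Test(timeout = 60000)
public void testEncryptionZoneSurvivesSaveNamespaceAndRestart() throws Exception {
  final Path zone = new Path("/zone");
  fs.mkdirs(zone);
  dfsAdmin.createEncryptionZone(zone, TEST_KEY);   // create an encryption zone
  assertNumZones(1);                               // the new zone is listed

  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
  fs.saveNamespace();                              // save the namespace
  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);

  cluster.restartNameNode(true);                   // "kill and restart" the NN
  assertNumZones(1);   // currently fails: the zone is missing after restart
}
{code}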
[jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111766#comment-14111766 ] Hadoop QA commented on HDFS-6727: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664501/HDFS-6727.combo.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZones org.apache.hadoop.hdfs.server.datanode.TestBPOfferService org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.TestDataTransferKeepalive The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.viewfs.TestViewFsAtHdfsRoot {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//console This message is automatically generated. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
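Since this JIRA builds on the HADOOP-7001 reconfiguration framework, here is a hedged sketch of how a DataNode-side component could hook {{dfs.datanode.data.dir}} into it; the exact {{ReconfigurableBase}} method signatures vary between Hadoop versions, and {{refreshVolumes}} is a hypothetical hook rather than the actual patch.
{code}
import java.util.Arrays;
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.ReconfigurableBase;
import org.apache.hadoop.conf.ReconfigurationException;

// Hedged sketch: how a DataNode-like service might plug into the HADOOP-7001
// reconfiguration framework. Signatures may differ between Hadoop versions.
public class ReconfigurableVolumes extends ReconfigurableBase {
  public ReconfigurableVolumes(Configuration conf) {
    super(conf);
  }

  @Override
  public Collection<String> getReconfigurableProperties() {
    // Only the data-dir key is reconfigurable at runtime in this sketch.
    return Arrays.asList("dfs.datanode.data.dir");
  }

  @Override
  protected void reconfigurePropertyImpl(String property, String newVal)
      throws ReconfigurationException {
    if ("dfs.datanode.data.dir".equals(property)) {
      refreshVolumes(newVal);   // hypothetical hook that adds/removes volumes
    } else {
      throw new ReconfigurationException(property, newVal,
          getConf().get(property));
    }
  }

  private void refreshVolumes(String newDataDirs) {
    // Placeholder: diff old vs. new directories and add/remove volumes.
  }
}
{code}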
[jira] [Updated] (HDFS-6944) Archival Storage: add a test framework for testing different migration scenarios
[ https://issues.apache.org/jira/browse/HDFS-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6944: Attachment: HDFS-6944.001.patch Update the patch: # For a pendingMove, a target may currently be scheduled on the same DataNode as an existing replica. This is because we currently use MovedBlocks.Locations#isLocatedOn, which compares StorageGroup instances. Then, when we do the data migration, the DN may complain that it already has a replica and fail the migration. A fix is to do the comparison based on StorageGroup#getDataNodeInfo(). # Currently the Mover cannot terminate since Mover#run always returns IN_PROGRESS. The patch adds code to wait for the existing migration to finish, and also adds a simple termination condition. Archival Storage: add a test framework for testing different migration scenarios Key: HDFS-6944 URL: https://issues.apache.org/jira/browse/HDFS-6944 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-6944.000.patch, HDFS-6944.001.patch This jira plans to add a testing framework for testing different scenarios of data migration. -- This message was sent by Atlassian JIRA (v6.2#6252)
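A small sketch of the fix described in item 1, assuming the Dispatcher's {{StorageGroup}} type and a {{getDataNodeInfo()}} accessor as referenced above; this is illustrative rather than the code in HDFS-6944.001.patch:
{code}
// Illustrative only: compares the underlying DataNodes instead of
// StorageGroup instances, so an existing replica on the target's DataNode
// (even under a different storage type) disqualifies that target.
private static boolean isLocatedOnSameDatanode(List<StorageGroup> locations,
    StorageGroup target) {
  for (StorageGroup loc : locations) {
    if (loc.getDataNodeInfo().equals(target.getDataNodeInfo())) {
      return true;   // the target DataNode already holds a replica
    }
  }
  return false;
}
{code}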
[jira] [Commented] (HDFS-6944) Archival Storage: add a test framework for testing different migration scenarios
[ https://issues.apache.org/jira/browse/HDFS-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111795#comment-14111795 ] Jing Zhao commented on HDFS-6944: - The patch depends on HDFS-6899 (to be merged from trunk) and HDFS-6911. Archival Storage: add a test framework for testing different migration scenarios Key: HDFS-6944 URL: https://issues.apache.org/jira/browse/HDFS-6944 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-6944.000.patch, HDFS-6944.001.patch This jira plans to add a testing framework for testing different scenarios of data migration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6779) hdfs version subcommand is missing
[ https://issues.apache.org/jira/browse/HDFS-6779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-6779: --- Summary: hdfs version subcommand is missing (was: [post-HADOOP-9902] hdfs version subcommand is missing) hdfs version subcommand is missing -- Key: HDFS-6779 URL: https://issues.apache.org/jira/browse/HDFS-6779 Project: Hadoop HDFS Issue Type: Improvement Components: scripts Reporter: Allen Wittenauer Labels: scripts 'hdfs version' is missing -- This message was sent by Atlassian JIRA (v6.2#6252)