[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110360#comment-14110360 ] Hadoop QA commented on HDFS-6773: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664301/HDFS-6773.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7763//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7763//console This message is automatically generated. MiniDFSCluster should skip edit log fsync by default Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
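For illustration, a minimal sketch (using the standard MiniDFSCluster builder API, not the committed patch) of how a single test can apply the fsync skip quoted above before bringing up the mini cluster:
{code}
// Sketch only: skip edit-log fsync for a test that does not exercise
// edit-log durability, then start a MiniDFSCluster as usual.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream;

public class FastMiniClusterSketch {
  public static void main(String[] args) throws Exception {
    // The call quoted in the issue: edit-log syncs skip the fsync.
    EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);

    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .build();
    try {
      cluster.waitActive();
      // ... test logic against cluster.getFileSystem() goes here ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}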
[jira] [Updated] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6898: Attachment: HDFS-6898.04.patch Thanks for the review. Addressed all your feedback and added a stress test. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6898: Attachment: HDFS-6898.05.patch Fix a typo. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Summary: Optimize HDFS Encrypted Transport performance (was: Optimize encryption support in DataTransfer Protocol with High performance) Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism, it supports three security strength: * high 3des or rc4 (126bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6921) Add LazyPersist flag to FileStatus
[ https://issues.apache.org/jira/browse/HDFS-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110457#comment-14110457 ] Vinayakumar B commented on HDFS-6921: - I feel that since the current patch does not modify the write(..) and readFields(..) methods of the Writable interface, FileStatus.java is still compatible. I agree that, as a result, FileStatus will not carry isLazyPersist over the wire. But in HDFS this is carried through HdfsFileStatus's proto message, which is backward compatible by default. So I feel this may not be a problem for existing clients, and hence DistCp would also work fine. Am I missing anything? Add LazyPersist flag to FileStatus -- Key: HDFS-6921 URL: https://issues.apache.org/jira/browse/HDFS-6921 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6921.01.patch, HDFS-6921.02.patch A new flag will be added to FileStatus to indicate that a file can be lazily persisted to disk, i.e., trading reduced durability for better write performance. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6922) Add LazyPersist flag to INodeFile, save it in FsImage and edit logs
[ https://issues.apache.org/jira/browse/HDFS-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110469#comment-14110469 ] Vinayakumar B commented on HDFS-6922: - 1. The layout version should be changed in NameNodeLayoutVersion.java:
{code}
+    LAZY_PERSIST_FILES(-55, -52, "Support for optional lazy persistence of "
+        + "files with reduced durability guarantees",
+        true, PROTOBUF_FORMAT, EXTENDED_ACL);
{code}
2. Better to use Java naming conventions in BlockCollection.java:
{code}
+  public boolean getLazyPersistFlag();
{code}
Add LazyPersist flag to INodeFile, save it in FsImage and edit logs --- Key: HDFS-6922 URL: https://issues.apache.org/jira/browse/HDFS-6922 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6922.01.patch Support for saving the LazyPersist flag in the FsImage and edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110474#comment-14110474 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664319/HDFS-6826v7.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7764//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7764//console This message is automatically generated. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
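As a rough illustration of the kind of hook described above, here is a hypothetical shape of an authorization-delegation interface; the actual API is the one defined in the attached patches and design documents, not this sketch:
{code}
// Hypothetical sketch only: a NameNode-side hook that lets an external
// system answer permission checks for HDFS paths. Names are illustrative.
public interface AuthorizationDelegationSketch {
  /** Return true if the given user may perform the action on the path. */
  boolean isPermitted(String user, String path, String action);
}
{code}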
[jira] [Created] (HDFS-6945) ExcessBlocks metric may not be decremented if there are no over replicated blocks
Akira AJISAKA created HDFS-6945: --- Summary: ExcessBlocks metric may not be decremented if there are no over replicated blocks Key: HDFS-6945 URL: https://issues.apache.org/jira/browse/HDFS-6945 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Akira AJISAKA I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters; however, there are no over-replicated blocks (confirmed by fsck). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6832) Fix the usage of 'hdfs namenode' command
[ https://issues.apache.org/jira/browse/HDFS-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-6832: Target Version/s: 2.6.0 (was: 2.5.0) Fix the usage of 'hdfs namenode' command Key: HDFS-6832 URL: https://issues.apache.org/jira/browse/HDFS-6832 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Reporter: Akira AJISAKA Assignee: skrho Priority: Minor Labels: newbie Attachments: hdfs-6832.txt, hdfs-6832_001.txt {code} [root@trunk ~]# hdfs namenode -help Usage: java NameNode [-backup] | [-checkpoint] | [-format [-clusterid cid ] [-force] [-nonInteractive] ] | [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] | [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] | [-rollback] | [-rollingUpgrade downgrade|rollback ] | [-finalize] | [-importCheckpoint] | [-initializeSharedEdits] | [-bootstrapStandby] | [-recover [ -force] ] | [-metadataVersion ] ] {code} There're some issues in the usage to be fixed. # Usage: java NameNode should be Usage: hdfs namenode # -rollingUpgrade started option should be added # The last ']' should be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110554#comment-14110554 ] Hadoop QA commented on HDFS-6898: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664334/HDFS-6898.04.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7765//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7765//console This message is automatically generated. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110563#comment-14110563 ] Hadoop QA commented on HDFS-6898: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664335/HDFS-6898.05.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestPersistBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7766//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7766//console This message is automatically generated. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110632#comment-14110632 ] Vinayakumar B commented on HDFS-6898: - Do you think this reservation should be done for the tmp files also? DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
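As an aside, a minimal, hypothetical sketch of the up-front per-volume reservation idea discussed in this issue; the class and method names below are illustrative and do not reflect the actual FsDatasetImpl/FsVolumeImpl code in the patches:
{code}
// Hypothetical sketch: reserve a full block's worth of space when an RBW
// replica is created, and release the unwritten remainder when it is
// finalized, so writers fail fast instead of hitting DiskOutOfSpace mid-write.
import java.util.concurrent.atomic.AtomicLong;

public class VolumeSpaceReservationSketch {
  private final long capacityBytes;                  // total bytes on the volume
  private final AtomicLong reservedBytes = new AtomicLong(0);

  public VolumeSpaceReservationSketch(long capacityBytes) {
    this.capacityBytes = capacityBytes;
  }

  /** Reserve a full block up front; throw if the volume cannot hold it. */
  public void reserveForRbw(long blockSize, long usedBytes) {
    long newReserved = reservedBytes.addAndGet(blockSize);
    if (usedBytes + newReserved > capacityBytes) {
      reservedBytes.addAndGet(-blockSize);           // roll back the reservation
      throw new IllegalStateException("Insufficient space to reserve a full block");
    }
  }

  /** Release the portion of the reservation that was never written. */
  public void releaseReservation(long unwrittenBytes) {
    reservedBytes.addAndGet(-unwrittenBytes);
  }
}
{code}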
[jira] [Commented] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110645#comment-14110645 ] Zesheng Wu commented on HDFS-6827: -- [~vinayrpet], I verified your patch of HADOOP-10251 on my cluster, it works as expected. Thanks. I will resolve this issue as 'duplicated'. Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes -- Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Critical Attachments: HDFS-6827.1.patch In our production cluster, we encounter a scenario like this: ANN crashed due to write journal timeout, and was restarted by the watchdog automatically, but after restarting both of the NNs are standby. Following is the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected connection reset by peer {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 wat restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 was healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the failover, as a result, both NNs were standby. The root cause of this bug is that NN is restarted too quickly and ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6827) Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes
[ https://issues.apache.org/jira/browse/HDFS-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6827: - Resolution: Duplicate Status: Resolved (was: Patch Available) Duplicate of HADOOP-10251. Both NameNodes stuck in STANDBY state due to HealthMonitor not aware of the target's status changing sometimes -- Key: HDFS-6827 URL: https://issues.apache.org/jira/browse/HDFS-6827 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.1 Reporter: Zesheng Wu Assignee: Zesheng Wu Priority: Critical Attachments: HDFS-6827.1.patch In our production cluster, we encounter a scenario like this: ANN crashed due to write journal timeout, and was restarted by the watchdog automatically, but after restarting both of the NNs are standby. Following is the logs of the scenario: # NN1 is down due to write journal timeout: {color:red}2014-08-03,23:02:02,219{color} INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG # ZKFC1 detected connection reset by peer {color:red}2014-08-03,23:02:02,560{color} ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xx@xx.HADOOP (auth:KERBEROS) cause:java.io.IOException: {color:red}Connection reset by peer{color} # NN1 wat restarted successfully by the watchdog: 2014-08-03,23:02:07,884 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: xx:13201 2014-08-03,23:02:07,884 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting {color:red}2014-08-03,23:02:07,884{color} INFO org.apache.hadoop.ipc.Server: IPC Server listener on 13200: starting 2014-08-03,23:02:08,742 INFO org.apache.hadoop.ipc.Server: RPC server clean thread started! 2014-08-03,23:02:08,743 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Registered DFSClientInformation MBean 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: xx/xx:13200 2014-08-03,23:02:08,744 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state # ZKFC1 retried the connection and considered NN1 was healthy {color:red}2014-08-03,23:02:08,292{color} INFO org.apache.hadoop.ipc.Client: Retrying connect to server: xx/xx:13200. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS) # ZKFC1 still considered NN1 as a healthy Active NN, and didn't trigger the failover, as a result, both NNs were standby. The root cause of this bug is that NN is restarted too quickly and ZKFC health monitor doesn't realize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Description: In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. was: In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism, it supports three security strength: * high 3des or rc4 (126bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: OptimizeHdfsEncryptedTransportperformance.pdf Attach a brief design for this optimization. Our goals are: * Support using CryptoCodec for encryption of HDFS transport. By default client and server will negotiate to use AES-CTR. * Compatibility: for old client or old server, it still works. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 Attachments: OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
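For background on the proposed AES-CTR transport, a small plain-JCE sketch (not the Hadoop CryptoCodec API from HADOOP-10150/10603/10693) showing how an output stream can be wrapped with AES/CTR; key and IV negotiation is omitted:
{code}
// Illustration only: wrap a raw stream with AES/CTR using the standard JCE
// API. With AES-NI capable hardware this mode is far faster than 3DES/RC4.
import java.io.OutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrStreamSketch {
  public static OutputStream wrap(OutputStream raw, byte[] key, byte[] iv)
      throws Exception {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE,
        new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    // Every byte written to the returned stream is encrypted with AES-CTR.
    return new CipherOutputStream(raw, cipher);
  }
}
{code}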
[jira] [Moved] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu moved HBASE-11824 to HDFS-6946: -- Key: HDFS-6946 (was: HBASE-11824) Project: Hadoop HDFS (was: HBase) TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: HDFS-6776.009.patch distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: (was: HDFS-6776.009.patch) distcp from insecure cluster (source) to secure cluster (destination) doesn't work -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Attachment: HDFS-6776.009.patch distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
[jira] [Updated] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-6776: Summary: distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs (was: distcp from insecure cluster (source) to secure cluster (destination) doesn't work) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110758#comment-14110758 ] Yongjun Zhang commented on HDFS-6776: - Uploaded patch 009. This version passes a real null delegation token for webhdfs when an insecure cluster is asked for a delegation token, which hopefully addresses the earlier concern. In addition, I included a config property which has to be turned on to support the fallback. Hi [~daryn] and [~wheat9], thanks a lot for your earlier comments; hopefully this addresses them. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
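Regarding the fallback switch mentioned in the comment above, a hedged sketch of how such a client-side property is typically read; the key shown is an existing Hadoop setting used purely as an example and may not be the property introduced by this patch:
{code}
// Illustration only: a client refuses to fall back to insecure auth unless
// the operator has explicitly enabled the fallback switch.
import org.apache.hadoop.conf.Configuration;

public class FallbackSwitchSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    boolean allowFallback =
        conf.getBoolean("ipc.client.fallback-to-simple-auth-allowed", false);
    if (!allowFallback) {
      System.err.println("Refusing to fall back to insecure authentication");
    }
  }
}
{code}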
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: HDFS-6606.001.patch Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Fix Version/s: (was: 3.0.0) Target Version/s: 2.6.0 (was: 3.0.0) Affects Version/s: (was: 3.0.0) Status: Patch Available (was: In Progress) Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6938) Cleanup javac warnings in FSNamesystem.java
[ https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110798#comment-14110798 ] Charles Lamb commented on HDFS-6938: Since the diffs are only fixing unused imports and fields, no unit tests are necessary. Cleanup javac warnings in FSNamesystem.java --- Key: HDFS-6938 URL: https://issues.apache.org/jira/browse/HDFS-6938 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6938.001.patch Clean up some unused code/compiler warnings post fs-encryption merge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110805#comment-14110805 ] Yongjun Zhang commented on HDFS-6776: - BTW, I'd like to restrict the solution of the jira for webhdfs only, and I modified the title of this jira to reflect that. At least with the fix, we can enable distcping between secure and insecure cluster. As we know, right now it's broken. For other interface, like hftp in branch-2. I will file follow-up jira to resolve them. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
[jira] [Commented] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110927#comment-14110927 ] Juan Yu commented on HDFS-6908: --- [~jingzhao]] Thanks for reviewing patch and the discussion. incorrect snapshot directory diff generated by snapshot deletion Key: HDFS-6908 URL: https://issues.apache.org/jira/browse/HDFS-6908 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Juan Yu Assignee: Juan Yu Priority: Critical Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch, HDFS-6908.003.patch In the following scenario, delete snapshot could generate incorrect snapshot directory diff and corrupted fsimage, if you restart NN after that, you will get NullPointerException. 1. create a directory and create a file under it 2. take a snapshot 3. create another file under that directory 4. take second snapshot 5. delete both files and the directory 6. delete second snapshot incorrect directory diff will be generated. Restart NN will throw NPE {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328) at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254) at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168) at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
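The six reproduction steps above map directly onto the public snapshot APIs, so the scenario can be scripted against a mini cluster. The following is a rough sketch under the usual test assumptions (MiniDFSCluster, DFSTestUtil, illustrative paths); it also saves the namespace and restarts the NameNode so the corrupted fsimage is actually reloaded and the NPE surfaces:
{code}
// Assumes: org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.Path,
// org.apache.hadoop.hdfs.{MiniDFSCluster, DFSTestUtil, DistributedFileSystem},
// org.apache.hadoop.hdfs.protocol.HdfsConstants.
Configuration conf = new Configuration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
try {
  cluster.waitActive();
  DistributedFileSystem dfs = cluster.getFileSystem();

  Path root = new Path("/test");
  Path sub = new Path(root, "sub");                                      // step 1
  dfs.mkdirs(sub);
  DFSTestUtil.createFile(dfs, new Path(sub, "f1"), 1024, (short) 1, 0L);

  dfs.allowSnapshot(root);
  dfs.createSnapshot(root, "s1");                                        // step 2
  DFSTestUtil.createFile(dfs, new Path(sub, "f2"), 1024, (short) 1, 0L); // step 3
  dfs.createSnapshot(root, "s2");                                        // step 4

  dfs.delete(sub, true);                                                 // step 5
  dfs.deleteSnapshot(root, "s2");                                        // step 6

  // Persist the (bad) directory diff into an fsimage and reload it.
  dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
  dfs.saveNamespace();
  dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);
  cluster.restartNameNode();
} finally {
  cluster.shutdown();
}
{code}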
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110933#comment-14110933 ] Ray Chiang commented on HDFS-6942: -- Both unit test failures are unrelated and both tests work in my tree. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.2#6252)

[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-6826: - Attachment: HDFS-6826v7.4.patch The failing tests pass locally, and scanning the test output does not show anything related to this patch. Uploading a new v7 patch with some refactoring, making the authz provider an abstract class with singleton-pattern access. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
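For readers following the refactoring mentioned in the comment above, "an abstract class with singleton-pattern access" generally has the shape below. The class and method names here are placeholders and are not taken from HDFS-6826v7.4.patch:
{code}
// Hypothetical sketch of the pattern only; not the actual classes in the patch.
public abstract class AuthorizationProvider {
  private static AuthorizationProvider instance;

  /** NameNode code asks the singleton for authorization decisions. */
  public static synchronized AuthorizationProvider get() {
    return instance;
  }

  /** Set at NameNode startup, either to the default or to an external plugin. */
  public static synchronized void set(AuthorizationProvider provider) {
    instance = provider;
  }

  /** Subclasses implement the actual permission check for a path and user. */
  public abstract void checkPermission(String path, String user)
      throws org.apache.hadoop.security.AccessControlException;
}
{code}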
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110950#comment-14110950 ] Hadoop QA commented on HDFS-6776: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664394/HDFS-6776.009.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7767//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7767//console This message is automatically generated. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to 
get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110949#comment-14110949 ] Hadoop QA commented on HDFS-6776: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664395/HDFS-6776.009.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7768//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7768//console This message is automatically generated. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, 
user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at
[jira] [Commented] (HDFS-6773) MiniDFSCluster should skip edit log fsync by default
[ https://issues.apache.org/jira/browse/HDFS-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110953#comment-14110953 ] Stephen Chu commented on HDFS-6773: --- The above two test failures aren't related to this patch. I ran them locally successfully to double-check. MiniDFSCluster should skip edit log fsync by default Key: HDFS-6773 URL: https://issues.apache.org/jira/browse/HDFS-6773 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Stephen Chu Attachments: HDFS-6773.1.patch, HDFS-6773.2.patch, HDFS-6773.2.patch The mini cluster is unnecessarily running with durable edit logs. The following change cut runtime of a single test from ~30s to ~10s. {code}EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);{code} The mini cluster should default to this behavior after identifying the few edit log tests that probably depend on durable logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
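Until the mini cluster skips the fsync by default, an individual test can opt in itself. A minimal sketch of that setup, assuming the usual JUnit plus MiniDFSCluster scaffolding:
{code}
private Configuration conf;
private MiniDFSCluster cluster;

@Before
public void setUp() throws Exception {
  // Skip the per-transaction fsync of the edit log; only safe for tests that
  // do not assert durability of edits across a crash.
  EditLogFileOutputStream.setShouldSkipFsyncForTesting(true);
  conf = new HdfsConfiguration();
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
  cluster.waitActive();
}

@After
public void tearDown() throws Exception {
  if (cluster != null) {
    cluster.shutdown();
  }
}
{code}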
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110970#comment-14110970 ] Yongjun Zhang commented on HDFS-6694: - Hi Arpit, thanks for your earlier review, would you please help committing it? Thanks. TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Blocker Fix For: 2.6.0 Attachments: HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.002.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111006#comment-14111006 ] Arpit Agarwal commented on HDFS-6694: - The repo is not open for commits. TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Blocker Fix For: 2.6.0 Attachments: HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.002.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111022#comment-14111022 ] Colin Patrick McCabe commented on HDFS-6912: bq. Colin Patrick McCabe: this is a machine without any swap. I took another look at the code, and it looks like we're creating a sparse file by using {{ftruncate}}. That, in turn, leads to the SIGBUS later when we try to access the offset in the file, and no memory is available to de-sparsify it. To remedy this, I added a call to {{posix_fallocate}}. This will lead to the space in memory being allocated at the time we create the shared file descriptor, rather than later when we read from it. Because you are out of memory, you'll still get a failure... but the failure will happen during allocation, not later, and it will be an exception which is handled cleanly, not a SIGBUS which shuts down the JVM. See if this patch works for you. bq. The commit seems to be one of yours, can you explain why this suggests /dev/shm? The configuration default is in {{/dev/shm}} because that is present on every modern Linux installation. We always want the shared memory segment FD to be in memory, rather than on disk. We have to read from this thing prior to every short-circuit read, so it needs to be fast. ramfs would have been better, but this would require special setup which most users don't want to do right now. Maybe this will change if we start recommending ramfs for HDFS-5851. Anyway, ramfs and tmpfs will behave similarly when swap is off, as in your case. HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. 
{code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j
[jira] [Updated] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Assignee: Colin Patrick McCabe Status: Patch Available (was: Open) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.001.patch HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage --- Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Summary: SharedFileDescriptorFactory should not allocate sparse files (was: HDFS Short-circuit read implementation throws SIGBUS from misc.Unsafe usage) SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Attachments: HDFS-6912.001.patch The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6902) FileWriter should be closed in finally block in BlockReceiver#receiveBlock()
[ https://issues.apache.org/jira/browse/HDFS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111035#comment-14111035 ] Colin Patrick McCabe commented on HDFS-6902: +1. thanks FileWriter should be closed in finally block in BlockReceiver#receiveBlock() Key: HDFS-6902 URL: https://issues.apache.org/jira/browse/HDFS-6902 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Tsuyoshi OZAWA Priority: Minor Attachments: HDFS-6902.1.patch, HDFS-6902.2.patch Here is code starting from line 828: {code} try { FileWriter out = new FileWriter(restartMeta); // write out the current time. out.write(Long.toString(Time.now() + restartBudget)); out.flush(); out.close(); } catch (IOException ioe) { {code} If write() or flush() call throws IOException, out wouldn't be closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
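For reference, one way to restructure the quoted snippet so the stream is always closed is shown below. The variables restartMeta, restartBudget and LOG come from the surrounding BlockReceiver context; this is a sketch of the idea, not necessarily the exact code in the attached patches:
{code}
FileWriter out = null;
try {
  out = new FileWriter(restartMeta);
  // write out the current time.
  out.write(Long.toString(Time.now() + restartBudget));
  out.flush();
} catch (IOException ioe) {
  LOG.warn("Failed to write restart meta file " + restartMeta, ioe);
} finally {
  // org.apache.hadoop.io.IOUtils: closes the stream and swallows close() errors.
  IOUtils.cleanup(LOG, out);
}
{code}
On Java 7 and later, a try-with-resources block achieves the same guarantee with less code.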
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: HDFS-6851.000.patch Posting .000 patch for a testpatch run. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Description: SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. was: The short-circuit reader throws SIGBUS errors from Unsafe code and crashes the JVM when tmpfs on a disk is depleted. {code} --- T H R E A D --- Current thread (0x7eff387df800): JavaThread xxx daemon [_thread_in_vm, id=5880, stack(0x7eff28b93000,0x7eff28c94000)] siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x7eff3e51d000 {code} The entire backtrace of the JVM crash is {code} Stack: [0x7eff28b93000,0x7eff28c94000], sp=0x7eff28c90a10, free space=1014k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x88232c] Unsafe_GetLongVolatile+0x6c j sun.misc.Unsafe.getLongVolatile(Ljava/lang/Object;J)J+0 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.setFlag(J)V+8 j org.apache.hadoop.hdfs.ShortCircuitShm$Slot.makeValid()V+4 j org.apache.hadoop.hdfs.ShortCircuitShm.allocAndRegisterSlot(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+70 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlotFromExistingShm(Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+38 j org.apache.hadoop.hdfs.client.DfsClientShmManager$EndpointShmManager.allocSlot(Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Ljava/lang/String;Lorg/apache/hadoop/hdfs/ExtendedBlockId;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+100 j org.apache.hadoop.hdfs.client.DfsClientShmManager.allocSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+102 j org.apache.hadoop.hdfs.client.ShortCircuitCache.allocShmSlot(Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;Lorg/apache/hadoop/hdfs/net/DomainPeer;Lorg/apache/commons/lang/mutable/MutableBoolean;Lorg/apache/hadoop/hdfs/ExtendedBlockId;Ljava/lang/String;)Lorg/apache/hadoop/hdfs/ShortCircuitShm$Slot;+18 j org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo()Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+151 j org.apache.hadoop.hdfs.client.ShortCircuitCache.create(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;Lorg/apache/hadoop/util/Waitable;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+46 j org.apache.hadoop.hdfs.client.ShortCircuitCache.fetchOrCreate(Lorg/apache/hadoop/hdfs/ExtendedBlockId;Lorg/apache/hadoop/hdfs/client/ShortCircuitCache$ShortCircuitReplicaCreator;)Lorg/apache/hadoop/hdfs/client/ShortCircuitReplicaInfo;+230 j org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal()Lorg/apache/hadoop/hdfs/BlockReader;+175 j org.apache.hadoop.hdfs.BlockReaderFactory.build()Lorg/apache/hadoop/hdfs/BlockReader;+87 j org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(J)Lorg/apache/hadoop/hdfs/protocol/DatanodeInfo;+291 j 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(Lorg/apache/hadoop/hdfs/DFSInputStream$ReaderStrategy;II)I+83 j org.apache.hadoop.hdfs.DFSInputStream.read([BII)I+15 {code} This can be easily reproduced by starting the DataNode, filling up tmpfs (dd if=/dev/zero bs=1M of=/dev/shm/dummy.zero) and running a simple task. Priority: Minor (was: Major) SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Target Version/s: 3.0.0 (was: fs-encryption (HADOOP-10150 and HDFS-6134)) Affects Version/s: (was: fs-encryption (HADOOP-10150 and HDFS-6134)) 3.0.0 Status: Patch Available (was: Open) Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111039#comment-14111039 ] Hadoop QA commented on HDFS-6606: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664402/HDFS-6606.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7769//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7769//console This message is automatically generated. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol, it was a great work. It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports three security strength: * high 3des or rc4 (128bits) * medium des or rc4(56bits) * low rc4(40bits) 3des and rc4 are slow, only *tens of MB/s*, http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in future. Absolutely it’s bottleneck and will vastly affect the end to end performance. AES(Advanced Encryption Standard) is recommended as a replacement of DES, it’s more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111058#comment-14111058 ] Hadoop QA commented on HDFS-6892: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664273/HDFS-6892.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7771//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7771//console This message is automatically generated. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6947) Enhance HAR integration with encryption zones
Andrew Wang created HDFS-6947: - Summary: Enhance HAR integration with encryption zones Key: HDFS-6947 URL: https://issues.apache.org/jira/browse/HDFS-6947 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111064#comment-14111064 ] Hadoop QA commented on HDFS-6851: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664435/HDFS-6851.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7773//console This message is automatically generated. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6922) Add LazyPersist flag to INodeFile, save it in FsImage and edit logs
[ https://issues.apache.org/jira/browse/HDFS-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6922: Attachment: HDFS-6922.02.patch Thanks for reviewing Vinayakumar. Good catch on #1. I updated the patch. Not sure what you mean by the second comment. Which Java naming convention? Add LazyPersist flag to INodeFile, save it in FsImage and edit logs --- Key: HDFS-6922 URL: https://issues.apache.org/jira/browse/HDFS-6922 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6922.01.patch, HDFS-6922.02.patch Support for saving the LazyPersist flag in the FsImage and edit logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6898) DN must reserve space for a full block when an RBW block is created
[ https://issues.apache.org/jira/browse/HDFS-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111072#comment-14111072 ] Arpit Agarwal commented on HDFS-6898: - Yes it may be helpful to have reservation for tmp files also. I'll file a separate Jira to look into it. DN must reserve space for a full block when an RBW block is created --- Key: HDFS-6898 URL: https://issues.apache.org/jira/browse/HDFS-6898 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.5.0 Reporter: Gopal V Assignee: Arpit Agarwal Attachments: HDFS-6898.01.patch, HDFS-6898.03.patch, HDFS-6898.04.patch, HDFS-6898.05.patch DN will successfully create two RBW blocks on the same volume even if the free space is sufficient for just one full block. One or both block writers may subsequently get a DiskOutOfSpace exception. This can be avoided by allocating space up front. -- This message was sent by Atlassian JIRA (v6.2#6252)
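The reservation itself is simple accounting: set aside a full block's worth of space when the RBW replica is created and release it when the replica is finalized or aborted. A stripped-down illustration of that accounting follows; the real patch folds this into the DataNode's volume classes, so the class and field names here are only illustrative:
{code}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only; the actual change lives inside the DataNode volume code.
class VolumeSpaceAccounting {
  private final long capacity;
  private final AtomicLong used = new AtomicLong(0);
  private final AtomicLong reservedForRbw = new AtomicLong(0);

  VolumeSpaceAccounting(long capacity) {
    this.capacity = capacity;
  }

  /** Reserve a full block of space when an RBW replica is created. */
  boolean tryReserve(long blockSize) {
    while (true) {
      long reserved = reservedForRbw.get();
      if (used.get() + reserved + blockSize > capacity) {
        return false;                 // creating the replica would overcommit
      }
      if (reservedForRbw.compareAndSet(reserved, reserved + blockSize)) {
        return true;
      }
    }
  }

  /** Release the reservation when the replica is finalized or aborted. */
  void release(long blockSize) {
    reservedForRbw.addAndGet(-blockSize);
  }
}
{code}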
[jira] [Commented] (HDFS-6865) Byte array native checksumming on client side (HDFS changes)
[ https://issues.apache.org/jira/browse/HDFS-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111085#comment-14111085 ] Todd Lipcon commented on HDFS-6865: --- Thanks for doing the diligence on the performance tests. Looks like this will be a good speedup across the board. A few comments: - In the FSOutputSummer constructor, aren't checksumSize and maxChunkSize now redundant with the DataChecksum object that's passed in? {{checksumSize}} should be the same as {{sum.getChecksumSize()}} and {{maxChunkSize}} should be the same as {{sum.getBytesPerChecksum()}}, no? - Similarly, in the FSOutputSummer class, it seems like the member variables of the same names are redundant with the {{sum}} member variable. - Can you mark {{sum}} as {{final}} in FSOutputSummer? - Shouldn't BUFFER_NUM_CHUNKS be a multiple of 3, since we calculate three chunks worth in parallel in the native code? (worth a comment explaining the choice, too) {code} private int write1(byte b[], int off, int len) throws IOException { if(count==0 && len>=buf.length) { // local buffer is empty and user data has one chunk // checksum and output data {code} This comment is no longer accurate, right? The condition is now that the user data has provided data at least as long as our internal buffer. - {{writeChecksumChunk}} should probably be renamed to {{writeChecksumChunks}} and its javadoc updated. - It's a little weird that you loop over {{writeChunk}} and pass a single chunk per call, though you actually have data ready for multiple chunks, and the API itself seems to be perfectly suitable to pass all of the chunks at once. Did you want to leave this as a later potential optimization? {code} writeChunk(b, off + i, Math.min(maxChunkSize, len - i), checksum, i / maxChunkSize * checksumSize, checksumSize); {code} This code might be a little easier to read if you made some local variables: {code} int rem = Math.min(maxChunkSize, len - i); int ckOffset = i / maxChunkSize * checksumSize; writeChunk(b, off + i, rem, checksum, ckOffset, checksumSize); {code} {code} /* Forces any buffered output bytes to be checksumed and written out to * the underlying output stream. If keep is true, then the state of * this object remains intact. {code} This comment is now inaccurate. If {{keep}} is true, then it retains only the last partial chunk worth of buffered data. - The {{setNumChunksToBuffer}} static thing is kind of sketchy. What if, instead, you implemented flush() in FSOutputSummer such that it always flushed all completed chunks? (and not any partial last chunk). Then you could make those tests call flush() before checkFile(), and not have to break any abstractions? Byte array native checksumming on client side (HDFS changes) Key: HDFS-6865 URL: https://issues.apache.org/jira/browse/HDFS-6865 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, performance Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6865.2.patch, HDFS-6865.3.patch, HDFS-6865.4.patch, HDFS-6865.5.patch, HDFS-6865.patch Refactor FSOutputSummer to buffer data and use the native checksum calculation functionality introduced in HADOOP-10975. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6923) Propagate LazyPersist flag to DNs via DataTransferProtocol
[ https://issues.apache.org/jira/browse/HDFS-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6923: Attachment: HDFS-6923.02.patch Rebased patch. Propagate LazyPersist flag to DNs via DataTransferProtocol -- Key: HDFS-6923 URL: https://issues.apache.org/jira/browse/HDFS-6923 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6923.01.patch, HDFS-6923.02.patch If the LazyPersist flag is set in the file properties, the DFSClient will propagate it to the DataNode via DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1402#comment-1402 ] Haohui Mai commented on HDFS-6892: -- Looks good to me. I think there are multiple places that the code can be simplified by merging the declarations and definitions: {code} +FileHandle handle = null; +handle = readHandle(xdr); {code} to {code} FileHandle handle = readHandle(xdr); {code} And {code} +FileHandle handle = null; +long cookie; +long cookieVerf; +int count; +handle = readHandle(xdr); cookie = xdr.readHyper(); cookieVerf = xdr.readHyper(); count = xdr.readInt(); {code} to {code} FileHandle handle = readHandle(xdr); long cookie = xdr.readHyper(); long cookieVerf = xdr.readHyper(); int count = xdr.readInt(); {code} +1 once addressed. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch This method can be used for unit tests. Most requests implement this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request, READDIR3Request, READDIRPLUS3Request, RMDIR3Request, REMOVE3Request, SETATTR3Request, SYMLINK3Request. RENAME3Request is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6925) DataNode should attempt to place replicas on transient storage first if lazyPersist flag is received
[ https://issues.apache.org/jira/browse/HDFS-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6925: Attachment: HDFS-6925.02.patch Thanks for reviewing [~jnp]! Updated patch to remove unnecessary edit to VolumeChoosingPolicy. The while loop in createRbw is to allow fallback to disk. We'll execute it at the most twice. I think it simplifies the failure handling. DataNode should attempt to place replicas on transient storage first if lazyPersist flag is received Key: HDFS-6925 URL: https://issues.apache.org/jira/browse/HDFS-6925 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Environment: If the LazyPersist flag is received via DataTransferProtocol then DN should attempt to place the files on RAM disk first, and failing that on regular disk. Support for lazily moving replicas from RAM disk to persistent storage will be added later. Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6925.01.patch, HDFS-6925.02.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
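The at-most-twice loop described above is essentially "try RAM disk, fall back to disk once". A simplified sketch with hypothetical names (not the actual FsDatasetImpl/createRbw code):
{code}
import java.io.IOException;

class RbwPlacementSketch {
  enum StorageType { RAM_DISK, DISK }

  interface VolumeChooser {
    // Throws IOException if no volume of the requested type has enough space.
    String chooseVolume(StorageType type, long blockSize) throws IOException;
  }

  static String placeReplica(VolumeChooser chooser, long blockSize,
                             boolean lazyPersist) throws IOException {
    StorageType type = lazyPersist ? StorageType.RAM_DISK : StorageType.DISK;
    while (true) {                    // executes at most twice
      try {
        return chooser.chooseVolume(type, blockSize);
      } catch (IOException e) {
        if (type == StorageType.RAM_DISK) {
          type = StorageType.DISK;    // fall back to persistent storage and retry once
        } else {
          throw e;                    // already on disk: give up
        }
      }
    }
  }
}
{code}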
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1407#comment-1407 ] Todd Lipcon commented on HDFS-6912: --- Hey Colin. Did you verify that tmpfs supports fallocate going back to old versions? Looking at the kernel git history, it was only added in mid 2012 (e2d12e22c59ce714008aa5266d769f8568d74eac) corresponding to version 3.5. So, I'm not sure if it would be supported on el6 for example (maybe they backported it, maybe not). Doing a normal posix write() call to write some explicit zeros to the fd might be more portable and shouldn't really have any performance downside. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1418#comment-1418 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664433/HDFS-6912.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7772//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7772//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6948) DN rejects blocks if it has older UC block
Daryn Sharp created HDFS-6948: - Summary: DN rejects blocks if it has older UC block Key: HDFS-6948 URL: https://issues.apache.org/jira/browse/HDFS-6948 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp DNs appear to always reject blocks, even with newer genstamps, if they already have a UC copy in their tmp dir. {noformat}ReplicaAlreadyExistsException: Block XXX already exists in state TEMPORARY and thus cannot be created{noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions
[ https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1495#comment-1495 ] Hadoop QA commented on HDFS-6826: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664424/HDFS-6826v7.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7770//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7770//console This message is automatically generated. Plugin interface to enable delegation of HDFS authorization assertions -- Key: HDFS-6826 URL: https://issues.apache.org/jira/browse/HDFS-6826 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.4.1 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, HDFS-6826v7.patch, HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, HDFSPluggableAuthorizationProposal.pdf When Hbase data, HiveMetaStore data or Search data is accessed via services (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce permissions on corresponding entities (databases, tables, views, columns, search collections, documents). It is desirable, when the data is accessed directly by users accessing the underlying data files (i.e. from a MapReduce job), that the permission of the data files map to the permissions of the corresponding data entity (i.e. table, column family or search collection). To enable this we need to have the necessary hooks in place in the NameNode to delegate authorization to an external system that can map HDFS files/directories to data entities and resolve their permissions based on the data entities permissions. I’ll be posting a design proposal in the next few days. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6928) 'hdfs put' command should accept lazyPersist flag for testing
[ https://issues.apache.org/jira/browse/HDFS-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6928: Attachment: HDFS-6928.02.patch Rebased patch. 'hdfs put' command should accept lazyPersist flag for testing - Key: HDFS-6928 URL: https://issues.apache.org/jira/browse/HDFS-6928 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6928.01.patch, HDFS-6928.02.patch Add a '-l' flag to 'hdfs put' which creates the file with the LAZY_PERSIST option. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111231#comment-14111231 ] Brandon Li commented on HDFS-6892: -- Uploaded a new patch to address Haohui's comments. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6892: - Attachment: HDFS-6892.004.patch Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6929) NN periodically unlinks lazy persist files with missing replicas from namespace
[ https://issues.apache.org/jira/browse/HDFS-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6929: Attachment: HDFS-6929.02.patch Updated patch to allow turning off the scrubber, document the option. NN periodically unlinks lazy persist files with missing replicas from namespace --- Key: HDFS-6929 URL: https://issues.apache.org/jira/browse/HDFS-6929 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: HDFS-6581 Attachments: HDFS-6929.01.patch, HDFS-6929.02.patch Occasional data loss is expected when using the lazy persist flag due to node restarts. The NN will optionally unlink lazy persist files from the namespace to avoid them from showing up as corrupt files. This behavior can be turned off with a global option. In the future this may be made a per-file option controllable by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: (was: HDFS-6851.000.patch) Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone
[ https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6851: --- Attachment: HDFS-6851.000.patch Redo the .000 patch. The last one didn't include the two deleted files. Flush EncryptionZoneWithId and add an id field to EncryptionZone Key: HDFS-6851 URL: https://issues.apache.org/jira/browse/HDFS-6851 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6851.000.patch EncryptionZoneWithId can be flushed by moving the id field up to EncryptionZone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111284#comment-14111284 ] Colin Patrick McCabe commented on HDFS-6912: bq. Hey Colin. Did you verify that tmpfs supports fallocate going back to old versions? Looking at the kernel git history, it was only added in mid 2012 (e2d12e22c59ce714008aa5266d769f8568d74eac) corresponding to version 3.5. So, I'm not sure if it would be supported on el6 for example (maybe they backported it, maybe not). I believe the glibc {{posix_fallocate}} wrapper falls back to using {{write()}} calls when {{fallocate}} itself is not supported by the kernel. There is some discussion here: https://lists.gnu.org/archive/html/bug-coreutils/2009-05/msg00207.html which talks about: bq. i.e. fall back to using write() as the glibc posix_fallocate() implementation does. But, I think it's simpler to just use {{write}} here. Any performance advantage to using {{ftruncate}} + {{fallocate}} is going to be extremely tiny (or nonexistent) since this file is only 8192 bytes. And {{write}} is much more portable. So here is a new version that does that. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
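The write()-based approach is easy to picture: instead of only truncating the file to its final length (which leaves a sparse file), write explicit zeros for the whole length so the backing pages are actually allocated. The patch itself touches native code, so the Java snippet below is only a conceptual stand-in for the same idea, not the actual SharedFileDescriptorFactory change.
{code}
import java.io.IOException;
import java.io.RandomAccessFile;

class NonSparseAllocSketch {
  // Allocate a small file without holes by writing explicit zeros.
  static void allocateZeroFilled(String path, int length) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
      byte[] zeros = new byte[length]; // e.g. 8192 bytes for the shared segment
      raf.write(zeros);                // pages are really written, not left sparse
    }
  }
}
{code}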
[jira] [Commented] (HDFS-6892) Add XDR packaging method for each NFS request
[ https://issues.apache.org/jira/browse/HDFS-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111299#comment-14111299 ] Hadoop QA commented on HDFS-6892: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664461/HDFS-6892.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7774//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7774//console This message is automatically generated. Add XDR packaging method for each NFS request - Key: HDFS-6892 URL: https://issues.apache.org/jira/browse/HDFS-6892 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Affects Versions: 2.2.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-6892.001.patch, HDFS-6892.002.patch, HDFS-6892.003.patch, HDFS-6892.004.patch This method can be used for unit tests. Most request implements this by overriding RequestWithHandle#serialize() method. However, some request classes missed it, e.g., COMMIT3Request, MKDIR3Request,READDIR3Request, READDIRPLUS3Request, RMDIR3RequestREMOVE3Request, SETATTR3Request,SYMLINK3Request. RENAME3Reqeust is another example. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.002.patch SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6911: -- Attachment: h6911_20140827.patch h6911_20140827.patch: adds a new test for ScheduleSameBlock. Also adds another new test for ChooseExcess. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before scheduling a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
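In outline, the movedBlocks bookkeeping is just a "seen" set consulted before scheduling a move. A simplified sketch with hypothetical names (the real Balancer/Mover structures also handle details such as eviction, which are omitted here):
{code}
import java.util.HashSet;
import java.util.Set;

class MoverSchedulingSketch {
  private final Set<Long> movedBlocks = new HashSet<>(); // block IDs already scheduled

  /** Returns true if the block was newly scheduled, false if it was already scheduled. */
  synchronized boolean scheduleIfNew(long blockId) {
    if (movedBlocks.contains(blockId)) {
      return false;           // skip: a move for this block is already pending
    }
    movedBlocks.add(blockId); // remember it so later passes skip it
    return true;
  }
}
{code}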
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111370#comment-14111370 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664475/HDFS-6912.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.io.nativeio.TestSharedFileDescriptorFactory {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7776//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7776//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.000.combo.patch Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Attachment: HDFS-6808.000.patch Update patch to add command line supports. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6808: Affects Version/s: (was: 2.4.1) 2.5.0 Status: Patch Available (was: Open) Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111455#comment-14111455 ] Stephen Chu commented on HDFS-6946: --- Similar to HDFS-5803, where TestBalancer#TIMEOUT was bumped from 20s to 40s. We can run TestBalancer between current trunk and the time when HDFS-5803 was fixed to see if there is a performance regression while taking into account test code changes. If there isn't a regression, perhaps we should bump up the timeout. TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6946) TestBalancerWithSaslDataTransfer fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu reassigned HDFS-6946: - Assignee: Stephen Chu TestBalancerWithSaslDataTransfer fails in trunk --- Key: HDFS-6946 URL: https://issues.apache.org/jira/browse/HDFS-6946 Project: Hadoop HDFS Issue Type: Test Reporter: Ted Yu Assignee: Stephen Chu Priority: Minor From build #1849 : {code} REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity Error Message: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. Stack Trace: java.util.concurrent.TimeoutException: Cluster failed to reached expected values of totalSpace (current: 750, expected: 750), or usedSpace (current: 140, expected: 150), in more than 4 msec. at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:253) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancer(TestBalancer.java:578) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:551) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancer0Internal(TestBalancer.java:759) at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer.testBalancer0Integrity(TestBalancerWithSaslDataTransfer.java:34) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Target Version/s: 3.0.0, 2.6.0 (was: 2.6.0) Affects Version/s: 2.5.0 Status: Patch Available (was: Open) Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.4.1, 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6727: Attachment: HDFS-6727.combo.patch Update a combo patch against trunk. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6894) Add XDR parser method for each NFS response
[ https://issues.apache.org/jira/browse/HDFS-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6894: - Description: This can be an abstract method in NFS3Response to force the subclasses to implement. Add XDR parser method for each NFS response --- Key: HDFS-6894 URL: https://issues.apache.org/jira/browse/HDFS-6894 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li This can be an abstract method in NFS3Response to force the subclasses to implement. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6894) Add XDR parser method for each NFS response
[ https://issues.apache.org/jira/browse/HDFS-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6894: - Environment: (was: This can be an abstract method in NFS3Response to force the subclasses to implement.) Add XDR parser method for each NFS response --- Key: HDFS-6894 URL: https://issues.apache.org/jira/browse/HDFS-6894 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6891) Follow-on work for transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6891: -- Component/s: encryption Follow-on work for transparent data at rest encryption -- Key: HDFS-6891 URL: https://issues.apache.org/jira/browse/HDFS-6891 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Charles Lamb This is an umbrella JIRA to track remaining subtasks from HDFS-6134. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6912: --- Attachment: HDFS-6912.003.patch The unit test was relying on the file position being 0. I don't think anything else relies on this (we use mmap to access this) but in v3 of the patch, I made it restore the file position to 0 just for simplicity. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, HDFS-6912.003.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6950) Add Additional unit tests for HDFS-6581
Xiaoyu Yao created HDFS-6950: Summary: Add Additional unit tests for HDFS-6581 Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Bug Reporter: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Issue Type: Sub-task (was: Bug) Parent: HDFS-6581 Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Assignee: Xiaoyu Yao Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Bug Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111530#comment-14111530 ] Yongjun Zhang commented on HDFS-6776: - I'd like to emphasize that with the latest patch of using null token instead of NullToken exception, user has to apply the same patch to both source and target cluster. With the prior revision that Alejandro commented, that combines NullToken and message parsing,, user just need to patch the secure cluster. Thanks. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at
[jira] [Commented] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111551#comment-14111551 ] Tsz Wo Nicholas Sze commented on HDFS-6911: --- ... , maybe a more efficient way here is to track the inode id, ... I think it is a good idea. Let's do this improvement separately since it cannot reuse the Balancer code. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before schedule a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6929) NN periodically unlinks lazy persist files with missing replicas from namespace
[ https://issues.apache.org/jira/browse/HDFS-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111564#comment-14111564 ] Jitendra Nath Pandey commented on HDFS-6929: +1 NN periodically unlinks lazy persist files with missing replicas from namespace --- Key: HDFS-6929 URL: https://issues.apache.org/jira/browse/HDFS-6929 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: HDFS-6581 Attachments: HDFS-6929.01.patch, HDFS-6929.02.patch Occasional data loss is expected when using the lazy persist flag due to node restarts. The NN will optionally unlink lazy persist files from the namespace to avoid them from showing up as corrupt files. This behavior can be turned off with a global option. In the future this may be made a per-file option controllable by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files
[ https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111571#comment-14111571 ] Hadoop QA commented on HDFS-6912: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664509/HDFS-6912.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverControllerStress {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7779//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7779//console This message is automatically generated. SharedFileDescriptorFactory should not allocate sparse files Key: HDFS-6912 URL: https://issues.apache.org/jira/browse/HDFS-6912 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.5.0 Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, HDFS-6912.003.patch SharedFileDescriptor factory should not allocate sparse files. Sparse files can lead to a SIGBUS later in the short-circuit reader when we try to read from the sparse file and memory is not available. Note that if swap is enabled, we can still get a SIGBUS even with a non-sparse file, since the JVM uses MAP_NORESERVE in mmap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6776) distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111578#comment-14111578 ] Alejandro Abdelnur commented on HDFS-6776: -- IMO, enabling to work with an unpatched cluster (via message parsing) is a desirable capability as it does not require users to upgrade older clusters if they are just reading data from them. distcp from insecure cluster (source) to secure cluster (destination) doesn't work via webhdfs -- Key: HDFS-6776 URL: https://issues.apache.org/jira/browse/HDFS-6776 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0, 2.5.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-6776.001.patch, HDFS-6776.002.patch, HDFS-6776.003.patch, HDFS-6776.004.patch, HDFS-6776.004.patch, HDFS-6776.005.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.006.NullToken.patch, HDFS-6776.007.patch, HDFS-6776.008.patch, HDFS-6776.009.patch, dummy-token-proxy.js Issuing distcp command at the secure cluster side, trying to copy stuff from insecure cluster to secure cluster, and see the following problem: {code} hadoopuser@yjc5u-1 ~]$ hadoop distcp webhdfs://insure-cluster:port/tmp hdfs://sure-cluster:8020/tmp/tmptgt 14/07/30 20:06:19 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[webhdfs://insecure-cluster:port/tmp], targetPath=hdfs://secure-cluster:8020/tmp/tmptgt, targetPathExists=true} 14/07/30 20:06:19 INFO client.RMProxy: Connecting to ResourceManager at secure-clister:8032 14/07/30 20:06:20 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser 14/07/30 20:06:20 ERROR tools.DistCp: Exception encountered java.io.IOException: Failed to get the token for hadoopuser, user=hadoopuser at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:365) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:84) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:618) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:466) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:462) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1132) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:218) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:403) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:424) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:640) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:565) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:438) at
[jira] [Commented] (HDFS-6920) Archival Storage: check the storage type of delNodeHintStorage when deleting a replica
[ https://issues.apache.org/jira/browse/HDFS-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111612#comment-14111612 ] Jing Zhao commented on HDFS-6920: - The patch looks good to me. +1 Can we also have a unit test for this? Archival Storage: check the storage type of delNodeHintStorage when deleting a replica -- Key: HDFS-6920 URL: https://issues.apache.org/jira/browse/HDFS-6920 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6920_20140823.patch in BlockManager.chooseExcessReplicates, it does not check the storage type of delNodeHintStorage. Therefore, delNodeHintStorage could possibly be chosen even if its storage type is not an excess storage type. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111634#comment-14111634 ] Colin Patrick McCabe commented on HDFS-6634: The new design doc looks really good. {code} +@InterfaceAudience.Public +@InterfaceStability.Evolving +public class MissingEventsException extends Exception { {code} Since this is part of the public API, we should have a friendly toString method that prints something like inotify was unable to locate some events. We expected txid X, but were only able to read up to txid Y Re: INVALID_TXID. I think that we don't need to add this to the proto file as I suggested earlier. The only way to add it would be as an enum value, which seems like kind of a hack. So it's fine as-is. Re: the QuorumJournalManager changes: [~tlipcon], [~james.thomas], [~andrew.wang] and I talked offline about this. The existing logic in QJM to prevent reading uncommitted edits should suffice, so we shouldn't need to add the ability to fetch the writer epoch via an RPC. There should never be divergent QJM edit logs... as Todd pointed out, each QJM edit log should be up-to-date, or a prefix of an up-to-date log. We should do something to avoid rescanning those in-progress edit logs to find the final txid over and over on the JournalNodes, though. Overall, great work, James... I think this is almost ready to go. inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.2#6252)
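A friendly message along the lines suggested above might look like the following; the class name and txid fields here are placeholders for illustration, not the actual MissingEventsException implementation.
{code}
// Sketch only: surface the expected/actual transaction IDs in the exception message.
public class MissingEventsExceptionSketch extends Exception {
  private final long expectedTxid;
  private final long actualTxid;

  public MissingEventsExceptionSketch(long expectedTxid, long actualTxid) {
    super("inotify was unable to locate some events: expected txid " + expectedTxid
        + ", but was only able to read up to txid " + actualTxid);
    this.expectedTxid = expectedTxid;
    this.actualTxid = actualTxid;
  }

  public long getExpectedTxid() { return expectedTxid; }
  public long getActualTxid() { return actualTxid; }
}
{code}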
[jira] [Commented] (HDFS-6911) Archival Storage: check if a block is already scheduled in Mover
[ https://issues.apache.org/jira/browse/HDFS-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111637#comment-14111637 ] Jing Zhao commented on HDFS-6911: - +1 for the latest patch. Archival Storage: check if a block is already scheduled in Mover Key: HDFS-6911 URL: https://issues.apache.org/jira/browse/HDFS-6911 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h6911_20140823.patch, h6911_20140827.patch Similar to balancer, Mover should remember all blocks already scheduled to move (movedBlocks). Then, check it before schedule a new block move. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.
[ https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111653#comment-14111653 ] Hadoop QA commented on HDFS-6808: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664492/HDFS-6808.000.combo.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build///testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build///artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build///console This message is automatically generated. Add command line option to ask DataNode reload configuration. - Key: HDFS-6808 URL: https://issues.apache.org/jira/browse/HDFS-6808 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch The workflow of dynamically changing data volumes on DataNode is # Users manually changed {{dfs.datanode.data.dir}} in the configuration file # User use command line to notify DN to reload configuration and updates its volumes. This work adds command line support to notify DN to reload configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode
[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111655#comment-14111655 ] Sanjay Radia commented on HDFS-6469: My thoughts: * I do believe that a Paxos-based NN would give faster failover than what NN HA offers today (30 seconds to a few minutes, but typically no more than a minute or two). So this is clearly a benefit of CNode, though I have not heard a single customer complain about the failover time so far. * The proposed solution does not increase the write throughput. * The parallel-reads advantage of CNode can be achieved in the current HA setup with some work (this is discussed above). If this is the main benefit, then I would rather pursue enhancing the NN standby to support reads. Further, there is ongoing work to improve the locking in the NN. * I share Todd's view that ZK is not a usable reference implementation for Paxos. One really needs a Paxos library that can be plugged in rather than an external server-based solution like ZK. So at this stage I am having a hard time seeing benefits that justify the costs of adding this complexity. I do, however, understand the overhead that Wandisco faces in integrating their solution with HDFS each time HDFS is modified. Would a few plugin interfaces make it easier? I would be more than happy to support adding such plugins if they would help. Coordinated replication of the namespace using ConsensusNode Key: HDFS-6469 URL: https://issues.apache.org/jira/browse/HDFS-6469 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: CNodeDesign.pdf This is a proposal to introduce ConsensusNode - an evolution of the NameNode, which enables replication of the namespace on multiple nodes of an HDFS cluster by means of a Coordination Engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
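On the plugin-interface question, a purely illustrative sketch of the shape such a hook could take follows; none of these names come from CNodeDesign.pdf or from existing HDFS code, and it is only meant to make the discussion concrete.
{code}
import java.io.IOException;

// Purely illustrative: a minimal shape for a pluggable coordination hook.
// None of these names come from CNodeDesign.pdf or from existing HDFS code.
public interface CoordinationEnginePlugin {

  /** Submit a serialized namespace-mutating operation for agreement across replicas. */
  void submitProposal(byte[] serializedOp) throws IOException;

  /** Register a callback invoked, in the agreed global order, once a proposal commits. */
  void registerListener(ProposalListener listener);

  interface ProposalListener {
    void onAgreed(byte[] serializedOp);
  }
}
{code}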
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111707#comment-14111707 ] Yi Liu commented on HDFS-6705: -- If the super user is the owner, is it then unable to access the file? {quote} It is settable by any user which has hdfs access to that file. It can only be set and never removed. {quote} Then any user who has hdfs access can easily prevent the HDFS admin from accessing a file, and the admin can't access that file any more. Could we find a better way? Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6705.001.patch There needs to be an xattr that specifies that the HDFS admin cannot access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
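For context, this is roughly how such a marker xattr would be applied through the public FileSystem xattr API; the xattr name and path below are hypothetical and are not defined by the HDFS-6705 patch.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MarkFileUnreadableByAdmin {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical path and xattr name; the actual name is decided by the patch.
    Path file = new Path("/user/alice/secret.dat");
    // A marker xattr carries no value; per the discussion above, once set by a
    // user with access to the file it would never be removable.
    fs.setXAttr(file, "security.hdfs.unreadable.by.superuser", null);
  }
}
{code}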
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6951: -- Attachment: HDFS-6951-testrepo.patch Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
Stephen Chu created HDFS-6951: - Summary: Saving namespace and restarting NameNode will remove existing encryption zones Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6951: -- Description: Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. was: Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. To reproduce: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb reassigned HDFS-6951: -- Assignee: Charles Lamb Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Fix For: 3.0.0 Attachments: HDFS-6951-testrepo.patch Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.2#6252)
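A hedged sketch of the kind of reproduction test described above, written against the existing {{TestEncryptionZones}} fixtures ({{cluster}}, {{fs}}, {{dfsAdmin}}, {{TEST_KEY}}, {{assertNumZones}}); the attached HDFS-6951-testrepo.patch may differ in detail:
{code}
// Hedged sketch, not the attached patch: assumes the TestEncryptionZones
// fixtures (cluster, fs, dfsAdmin, TEST_KEY, assertNumZones) are in scope.
@Test(timeout = 60000)
public void testEncryptionZoneSurvivesSaveNamespaceAndRestart() throws Exception {
  final Path zone = new Path("/zone");
  fs.mkdirs(zone);
  dfsAdmin.createEncryptionZone(zone, TEST_KEY);   // create an encryption zone
  assertNumZones(1);                               // the new zone is listed

  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
  fs.saveNamespace();                              // save the namespace
  fs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_LEAVE);

  cluster.restartNameNode(true);                   // "kill and restart" the NN
  assertNumZones(1);   // currently fails: the zone is missing after restart
}
{code}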
[jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
[ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111766#comment-14111766 ] Hadoop QA commented on HDFS-6727: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664501/HDFS-6727.combo.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestEncryptionZones org.apache.hadoop.hdfs.server.datanode.TestBPOfferService org.apache.hadoop.security.TestRefreshUserMappings org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.TestDataTransferKeepalive The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.viewfs.TestViewFsAtHdfsRoot {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7778//console This message is automatically generated. Refresh data volumes on DataNode based on configuration changes --- Key: HDFS-6727 URL: https://issues.apache.org/jira/browse/HDFS-6727 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.5.0, 2.4.1 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: datanode Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.combo.patch HDFS-1362 requires DataNode to reload configuration file during the runtime, so that DN can change the data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable DN to reconfigure at runtime. -- This message was sent by Atlassian JIRA (v6.2#6252)
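Since this JIRA builds on the HADOOP-7001 reconfiguration framework, here is a hedged sketch of how a DataNode-side component could hook {{dfs.datanode.data.dir}} into it; the exact {{ReconfigurableBase}} method signatures vary between Hadoop versions, and {{refreshVolumes}} is a hypothetical hook rather than the actual patch.
{code}
import java.util.Arrays;
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.ReconfigurableBase;
import org.apache.hadoop.conf.ReconfigurationException;

// Hedged sketch: how a DataNode-like service might plug into the HADOOP-7001
// reconfiguration framework. Signatures may differ between Hadoop versions.
public class ReconfigurableVolumes extends ReconfigurableBase {
  public ReconfigurableVolumes(Configuration conf) {
    super(conf);
  }

  @Override
  public Collection<String> getReconfigurableProperties() {
    // Only the data-dir key is reconfigurable at runtime in this sketch.
    return Arrays.asList("dfs.datanode.data.dir");
  }

  @Override
  protected void reconfigurePropertyImpl(String property, String newVal)
      throws ReconfigurationException {
    if ("dfs.datanode.data.dir".equals(property)) {
      refreshVolumes(newVal);   // hypothetical hook that adds/removes volumes
    } else {
      throw new ReconfigurationException(property, newVal,
          getConf().get(property));
    }
  }

  private void refreshVolumes(String newDataDirs) {
    // Placeholder: diff old vs. new directories and add/remove volumes.
  }
}
{code}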
[jira] [Updated] (HDFS-6944) Archival Storage: add a test framework for testing different migration scenarios
[ https://issues.apache.org/jira/browse/HDFS-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6944: Attachment: HDFS-6944.001.patch Update the patch: # For a pendingMove, a target may currently be scheduled on the same DataNode as an existing replica. This is because we currently use MovedBlocks.Locations#isLocatedOn, which compares StorageGroup instances. Then, when we do the data migration, the DN may complain that it already has a replica and fail the migration. A fix is to do the comparison based on StorageGroup#getDataNodeInfo(). # Currently the Mover cannot terminate since Mover#run always returns IN_PROGRESS. The patch adds code to wait for the existing migration to finish, and also adds a simple termination condition. Archival Storage: add a test framework for testing different migration scenarios Key: HDFS-6944 URL: https://issues.apache.org/jira/browse/HDFS-6944 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-6944.000.patch, HDFS-6944.001.patch This jira plans to add a testing framework for testing different scenarios of data migration. -- This message was sent by Atlassian JIRA (v6.2#6252)
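A small sketch of the fix described in item 1, assuming the Dispatcher's {{StorageGroup}} type and a {{getDataNodeInfo()}} accessor as referenced above; this is illustrative rather than the code in HDFS-6944.001.patch:
{code}
// Illustrative only: compares the underlying DataNodes instead of
// StorageGroup instances, so an existing replica on the target's DataNode
// (even under a different storage type) disqualifies that target.
private static boolean isLocatedOnSameDatanode(List<StorageGroup> locations,
    StorageGroup target) {
  for (StorageGroup loc : locations) {
    if (loc.getDataNodeInfo().equals(target.getDataNodeInfo())) {
      return true;   // the target DataNode already holds a replica
    }
  }
  return false;
}
{code}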
[jira] [Commented] (HDFS-6944) Archival Storage: add a test framework for testing different migration scenarios
[ https://issues.apache.org/jira/browse/HDFS-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111795#comment-14111795 ] Jing Zhao commented on HDFS-6944: - The patch depends on HDFS-6899 (to be merged from trunk) and HDFS-6911. Archival Storage: add a test framework for testing different migration scenarios Key: HDFS-6944 URL: https://issues.apache.org/jira/browse/HDFS-6944 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer, namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-6944.000.patch, HDFS-6944.001.patch This jira plans to add a testing framework for testing different scenarios of data migration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6779) hdfs version subcommand is missing
[ https://issues.apache.org/jira/browse/HDFS-6779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-6779: --- Summary: hdfs version subcommand is missing (was: [post-HADOOP-9902] hdfs version subcommand is missing) hdfs version subcommand is missing -- Key: HDFS-6779 URL: https://issues.apache.org/jira/browse/HDFS-6779 Project: Hadoop HDFS Issue Type: Improvement Components: scripts Reporter: Allen Wittenauer Labels: scripts 'hdfs version' is missing -- This message was sent by Atlassian JIRA (v6.2#6252)