[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718106#comment-14718106 ] Hudson commented on HDFS-8900: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2264 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2264/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableList<XAttr> xAttrs; {code} We currently use the above in XAttrFeature; it is not memory-efficient, since {{ImmutableList}} and {{XAttr}} each carry per-object memory overhead and alignment padding. We can use a {{byte[]}} in XAttrFeature instead and compact the {{XAttr}} encoding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
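The compaction proposed above can be illustrated with a small sketch: serialize the name/value pairs into a single {{byte[]}} with length prefixes, masking each length byte with 0xff when reading it back (the sign-extension pitfall that HDFS-8963 later fixed). The {{XAttrPacker}} class and its on-disk layout are hypothetical illustrations, not the actual XAttrFormat code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pack xattr name/value pairs into one byte[] to avoid
// per-object overhead. Layout per entry: [nameLen:1][name][valueLen:2][value].
public class XAttrPacker {

    public static byte[] pack(List<String[]> xattrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] kv : xattrs) {
            byte[] name = kv[0].getBytes(StandardCharsets.UTF_8);
            byte[] value = kv[1].getBytes(StandardCharsets.UTF_8);
            out.write(name.length);            // assumes names shorter than 256 bytes
            out.write(name, 0, name.length);
            out.write(value.length >>> 8);     // 2-byte big-endian value length
            out.write(value.length & 0xff);
            out.write(value, 0, value.length);
        }
        return out.toByteArray();
    }

    public static List<String[]> unpack(byte[] packed) {
        List<String[]> result = new ArrayList<>();
        int i = 0;
        while (i < packed.length) {
            int nameLen = packed[i++] & 0xff;  // mask to avoid sign extension
            String name = new String(packed, i, nameLen, StandardCharsets.UTF_8);
            i += nameLen;
            int valueLen = ((packed[i] & 0xff) << 8) | (packed[i + 1] & 0xff);
            i += 2;
            String value = new String(packed, i, valueLen, StandardCharsets.UTF_8);
            i += valueLen;
            result.add(new String[] { name, value });
        }
        return result;
    }
}
```

The whole feature then holds one array plus one object header, instead of one list node and one XAttr object per attribute.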
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lifeng Wang updated HDFS-8987: -- Summary: Erasure coding: MapReduce job failed when I set the / folder to the EC zone (was: Erasure coding: MapReduce job failed when I set the / foler to the EC zone ) Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start the HDFS service. * After the HDFS service is started, there are no files in HDFS; I set the / folder to the EC zone, and the EC zone is created successfully. * Start the YARN and MR JobHistoryServer services. All the services start successfully. * Then I run the hadoop example pi program and it fails. The following is the exception. {noformat} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
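The failure above (tracked as a duplicate of HDFS-8937) boils down to a client calling setReplication on a file inside an EC zone, which the NameNode rejects for striped blocks. A minimal sketch of the kind of client-side guard that avoids issuing the RPC for striped files; the class, the {{Layout}} enum, and the method names are illustrative, not Hadoop APIs:

```java
// Hypothetical sketch of the guard a client needs: only ask the NameNode to
// change replication for contiguous (replicated) files, since striped
// (erasure-coded) files have no replication factor to set.
public class ReplicationGuard {
    enum Layout { CONTIGUOUS, STRIPED }

    /** Returns true if a setReplication RPC should be issued. */
    static boolean shouldSetReplication(Layout layout, short current, short desired) {
        if (layout == Layout.STRIPED) {
            return false; // NameNode would throw UnsupportedActionException
        }
        return current != desired; // skip the RPC when nothing would change
    }
}
```

With such a check, the pi job's attempt to raise replication on its staging files in the / EC zone would be skipped instead of failing the job.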
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Description: Test progress is as follows * For a new cluster, I format the namenode and then start the HDFS service. * After the HDFS service is started, there are no files in HDFS; I set the / folder to the EC zone, and the EC zone is created successfully. * Start the YARN and MR JobHistoryServer services. All the services start successfully. * Then I run the hadoop example pi program and it fails. The following is the exception. {noformat} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) {noformat} was:
Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ```
[jira] [Commented] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718113#comment-14718113 ] Zhe Zhang commented on HDFS-8987: - Thanks for testing and finding the issue [~Lifeng Wang]. Looks like it's a duplicate of HDFS-8937. If you also agree we can close this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8960) DFS client says no more good datanodes being available to try on a single drive failure
[ https://issues.apache.org/jira/browse/HDFS-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718146#comment-14718146 ] Yongjun Zhang commented on HDFS-8960: - Yes, it's trying to do pipeline recovery, see below: {code} [yzhang@localhost Downloads]$ grep -B 3 blk_1073817519 r12s16-datanode.log | grep firstbadlink 15/08/23 07:21:49 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.5:10110 15/08/23 07:21:49 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.5:10110 15/08/23 07:21:52 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.1:10110 15/08/23 07:21:52 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.1:10110 15/08/23 07:21:55 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.6:10110 15/08/23 07:21:55 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.6:10110 15/08/23 07:21:58 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.8:10110 15/08/23 07:21:58 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.8:10110 15/08/23 07:22:01 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.14:10110 15/08/23 07:22:01 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.14:10110 15/08/23 07:22:04 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.2:10110 15/08/23 07:22:04 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.2:10110 15/08/23 07:22:07 INFO 
datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.9:10110 15/08/23 07:22:07 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.9:10110 15/08/23 07:22:10 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.3:10110 15/08/23 07:22:10 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.3:10110 15/08/23 07:22:13 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.7:10110 15/08/23 07:22:13 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.7:10110 15/08/23 07:22:16 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.10:10110 15/08/23 07:22:16 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.10:10110 15/08/23 07:22:19 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.12:10110 15/08/23 07:22:19 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.12:10110 15/08/23 07:22:23 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.11:10110 15/08/23 07:22:23 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.11:10110 15/08/23 07:22:26 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.15:10110 15/08/23 07:22:26 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.15:10110 {code} It happens that the log you uploaded is from r12s13, which is not one of the nodes in the grepped messages (per your report, r12s13 is the last node in the
initial pipeline), and r12s16 is the source node. Would you please upload a few more DN logs? Thanks. DFS client says no more good datanodes being available to try on a single drive failure - Key: HDFS-8960 URL: https://issues.apache.org/jira/browse/HDFS-8960 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.1 Environment: openjdk version 1.8.0_45-internal OpenJDK Runtime Environment (build 1.8.0_45-internal-b14) OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode) Reporter: Benoit Sigoure Attachments: blk_1073817519_77099.log, r12s13-datanode.log, r12s16-datanode.log Since we upgraded to 2.7.1 we regularly see single-drive failures cause widespread problems at
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Issue Type: Sub-task (was: Improvement) Parent: HDFS-8793 Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap<>(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so we should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
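The trade-off described above can be demonstrated with the JDK analogues of the HDFS LightWeight* classes: a linked set preserves insertion order by carrying two extra references per entry (about 8 bytes with compressed oops), while a plain hash set does not. When iteration order is irrelevant, as for excess replicas, the plain set is strictly cheaper. This is an illustration of the design choice, not the HDFS classes themselves.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// LinkedHashSet threads a doubly-linked list through its entries (prev/next
// references) to remember insertion order; HashSet stores the same elements
// without that per-entry cost. Set equality ignores order, so callers that
// never rely on iteration order cannot observe the difference.
public class SetOverheadDemo {
    public static void main(String[] args) {
        Set<Integer> ordered = new LinkedHashSet<>();   // keeps order, bigger entries
        Set<Integer> unordered = new HashSet<>();       // no order, smaller entries
        for (int i = 0; i < 5; i++) {
            ordered.add(i);
            unordered.add(i);
        }
        System.out.println(ordered.equals(unordered)); // true
    }
}
```

The same reasoning applies per-block in excessReplicateMap: with millions of excess replicas tracked on a large cluster, 8 bytes per entry adds up.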
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718121#comment-14718121 ] Hadoop QA commented on HDFS-8704: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 41s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 32s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 39s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 19s | Tests failed in hadoop-hdfs. 
| | | | 230m 30s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752923/HDFS-8704-HDFS-7285-006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/patchReleaseAuditProblems.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/console | This message was automatically generated. Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch, HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch I test current code on a 5-node cluster using RS(3,2). When a datanode is corrupt, client succeeds to write a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group, this jira will add more test situations. 
A streamer may encounter bad datanodes when writing the blocks allocated to it. When it fails to connect to a datanode or to send a packet, the streamer needs to prepare for the next block. First it removes the packets of the current block from its data queue. If the first packet of the next block is already in the data queue, the streamer resets its state and starts waiting for the next block allocated to it; otherwise it just waits for the first packet of the next block. While waiting, the streamer periodically checks whether it has been asked to terminate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
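The recovery step described in that paragraph can be sketched as a small queue operation: drop the failed block's queued packets, then decide whether the streamer can reset immediately (the next block's first packet is already queued) or must keep waiting. The {{Packet}} type and method names here are illustrative, not the real DFSStripedOutputStream internals.

```java
import java.util.Deque;

// Hypothetical sketch of the streamer's "prepare for next block" step.
public class StreamerRecovery {
    static final class Packet {
        final long blockId;
        final int seq;
        Packet(long blockId, int seq) { this.blockId = blockId; this.seq = seq; }
    }

    /**
     * Removes the failed block's packets from the data queue. Returns true if
     * the next block's first packet is already queued (reset state now);
     * false if the streamer must wait for it to arrive.
     */
    static boolean prepareForNextBlock(Deque<Packet> dataQueue, long failedBlockId) {
        // Drop all queued packets that belong to the failed block.
        dataQueue.removeIf(p -> p.blockId == failedBlockId);
        // Anything left at the head belongs to a later block.
        return !dataQueue.isEmpty();
    }
}
```

In the real streamer the waiting branch also polls a termination flag, as the description notes, so a closed stream does not leave the thread blocked forever.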
[jira] [Commented] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718116#comment-14718116 ] Lifeng Wang commented on HDFS-8987: --- OK. Please help to close this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8987. - Resolution: Duplicate -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718102#comment-14718102 ] Hadoop QA commented on HDFS-8965: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 32s | Pre-patch trunk has 2 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 23s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 39s | The applied patch generated 12 new checkstyle issues (total was 401, now 407). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 27s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 22s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 191m 46s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 6m 33s | Tests passed in bkjournal. 
| | | | 246m 37s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.qjournal.server.TestJournal | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752906/HDFS-8965.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 035ed26 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/testrun_hadoop-hdfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/console | This message was automatically generated. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. 
This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
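The hardening idea in HDFS-8965 — sanity-check the op's length prefix and verify the checksum before trusting the record — can be sketched as follows. The {{SafeOpReader}} class, its record layout (4-byte length, 4-byte CRC32, body), and the size cap are illustrative assumptions, not the actual FSEditLogOp format.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch: validate a length-prefixed, checksummed record before
// parsing it, so a corrupt length can never trigger a multi-gigabyte
// allocation and an OutOfMemoryError on garbage data.
public class SafeOpReader {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024; // illustrative sanity cap

    /** Returns the op body, or null if the record fails validation. */
    static byte[] readOp(ByteBuffer in) {
        if (in.remaining() < 8) {
            return null;                         // not even a header present
        }
        int len = in.getInt();
        if (len < 0 || len > MAX_OP_SIZE || len > in.remaining() - 4) {
            return null;                         // implausible length: corruption
        }
        long expected = in.getInt() & 0xffffffffL; // stored CRC32, read unsigned
        byte[] body = new byte[len];             // bounded allocation
        in.get(body);
        CRC32 crc = new CRC32();
        crc.update(body, 0, len);
        return crc.getValue() == expected ? body : null; // verify before parsing
    }
}
```

Only after this returns non-null would the reader hand the bytes to the op deserializer, so garbage data is rejected before any op-level allocation happens.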
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718105#comment-14718105 ] Hudson commented on HDFS-8963: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2264 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2264/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
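The class of bug fixed here is easy to reproduce in isolation: a length stored in a byte must be widened with {{& 0xff}}, otherwise any value of 128 or more sign-extends to a negative int. The demo class below is illustrative, not the actual XAttrFormat code.

```java
// Minimal demonstration of the sign-extension pitfall: Java's byte is
// signed, so widening it to int without masking corrupts lengths >= 128.
public class SignExtensionDemo {
    static int lengthWrong(byte b) { return b; }          // sign-extends
    static int lengthRight(byte b) { return b & 0xff; }   // zero-extends

    public static void main(String[] args) {
        byte stored = (byte) 200;                 // a length of 200 bytes
        System.out.println(lengthWrong(stored));  // -56: corrupted length
        System.out.println(lengthRight(stored));  // 200: correct
    }
}
```

A negative length then fails array allocation or bounds checks downstream, which is what the FindBugs warnings on HDFS-8900 flagged.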
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Description: {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. was: {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references, totally 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Attachment: HDFS-8988.001.patch Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8987) MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Issue Type: Sub-task (was: Bug) Parent: HDFS-7285 MapReduce job failed when I set the / foler to the EC zone --- Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Summary: Erasure coding: MapReduce job failed when I set the / foler to the EC zone (was: MapReduce job failed when I set the / foler to the EC zone ) Erasure coding: MapReduce job failed when I set the / foler to the EC zone --- Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8689) move hasClusterEverBeenMultiRack to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su reopened HDFS-8689: - move hasClusterEverBeenMultiRack to NetworkTopology --- Key: HDFS-8689 URL: https://issues.apache.org/jira/browse/HDFS-8689 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8689) move hasClusterEverBeenMultiRack to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su resolved HDFS-8689. - Resolution: Invalid move hasClusterEverBeenMultiRack to NetworkTopology --- Key: HDFS-8689 URL: https://issues.apache.org/jira/browse/HDFS-8689 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
Yi Liu created HDFS-8988: Summary: Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references, totally 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
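A JDK analogue of the trade-off, for illustration only: `LinkedHashSet` maintains insertion order via two extra references per entry over `HashSet`, much as {{LightWeightLinkedSet}} does over {{LightWeightHashSet}}. When callers only add, remove, and test membership, as with excess replicas here, the plain hash set is functionally equivalent and smaller.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// JDK stand-ins for the two set flavors: the linked variant pays two extra
// references per entry just to remember insertion order.
public class SetChoiceDemo {
  // excessReplicateMap-style usage touches only add/remove/contains,
  // so both sets end up with identical membership.
  static boolean membershipEquivalent() {
    Set<Long> ordered = new LinkedHashSet<>();
    Set<Long> unordered = new HashSet<>();
    for (long blockId : new long[] {42L, 7L, 99L}) {
      ordered.add(blockId);
      unordered.add(blockId);
    }
    ordered.remove(7L);
    unordered.remove(7L);
    // Same membership; only iteration order could differ.
    return ordered.equals(unordered);
  }
}
```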
[jira] [Resolved] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8987. - Resolution: Fixed Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. {noformat} ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang reopened HDFS-8987: - Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. {noformat} ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718147#comment-14718147 ] Hudson commented on HDFS-8963: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #307 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/307/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718148#comment-14718148 ] Hudson commented on HDFS-8900: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #307 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/307/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableList<XAttr> xAttrs; {code} Currently we use above in XAttrFeature, it's not efficient from memory point of view, since {{ImmutableList}} and {{XAttr}} have object memory overhead, and each object has memory alignment. We can use a {{byte[]}} in XAttrFeature and do some compact in {{XAttr}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
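The compaction idea can be sketched as below. The layout and names are hypothetical, not the actual XAttrFormat encoding: all attributes are serialized into one byte[] of length-prefixed name/value pairs, trading N small objects (each with header and alignment padding) for a single array. Note the `& 0xff` masking on read, which is exactly the kind of detail HDFS-8963 later had to fix.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical compact encoding: [nameLen:2][name][valueLen:2][value] ...
// repeated for each attribute, all in one byte[].
public class PackedXAttrs {
  static byte[] pack(Map<String, byte[]> attrs) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (Map.Entry<String, byte[]> e : attrs.entrySet()) {
      byte[] name = e.getKey().getBytes(StandardCharsets.UTF_8);
      byte[] value = e.getValue();
      out.write(name.length >>> 8);   // big-endian 2-byte length
      out.write(name.length);
      out.write(name, 0, name.length);
      out.write(value.length >>> 8);
      out.write(value.length);
      out.write(value, 0, value.length);
    }
    return out.toByteArray();
  }

  static Map<String, byte[]> unpack(byte[] p) {
    Map<String, byte[]> attrs = new LinkedHashMap<>();
    int i = 0;
    while (i < p.length) {
      // Mask with 0xff so lengths >= 128 don't sign-extend to negatives.
      int nameLen = ((p[i] & 0xff) << 8) | (p[i + 1] & 0xff);
      i += 2;
      String name = new String(p, i, nameLen, StandardCharsets.UTF_8);
      i += nameLen;
      int valueLen = ((p[i] & 0xff) << 8) | (p[i + 1] & 0xff);
      i += 2;
      attrs.put(name, Arrays.copyOfRange(p, i, i + valueLen));
      i += valueLen;
    }
    return attrs;
  }

  static boolean roundTripDemo() {
    Map<String, byte[]> attrs = new LinkedHashMap<>();
    attrs.put("user.tag", new byte[] {1, 2, 3});
    attrs.put("user.note", "hi".getBytes(StandardCharsets.UTF_8));
    Map<String, byte[]> back = unpack(pack(attrs));
    return back.keySet().equals(attrs.keySet())
        && Arrays.equals(back.get("user.tag"), new byte[] {1, 2, 3});
  }
}
```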
[jira] [Updated] (HDFS-8964) Provide max TxId when validating in-progress edit log files
[ https://issues.apache.org/jira/browse/HDFS-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8964: Attachment: HDFS-8964.01.patch Thanks Colin for taking a look. Updating the patch to better handle cases with no provided {{maxTxId}}. I also found {{scanLog}} is identical to {{validateLog}} and removed it from two places. {{FileJournalManager#getRemoteEditLogs}} and {{selectInputStreams}} are already updated to provide {{maxTxId}}. Where else do you think we are trying to read an active in-progress edit file? Provide max TxId when validating in-progress edit log files --- Key: HDFS-8964 URL: https://issues.apache.org/jira/browse/HDFS-8964 Project: Hadoop HDFS Issue Type: Bug Components: journal-node, namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8964.00.patch, HDFS-8964.01.patch NN/JN validates in-progress edit log files in multiple scenarios, via {{EditLogFile#validateLog}}. The method scans through the edit log file to find the last transaction ID. However, an in-progress edit log file could be actively written to, which creates a race condition and causes incorrect data to be read (and later we attempt to interpret the data as ops). Currently {{validateLog}} is used in 3 places: # NN {{getEditsFromTxid}} # JN {{getEditLogManifest}} # NN/JN {{recoverUnfinalizedSegments}} In the first two scenarios we should provide a maximum TxId to validate in the in-progress file. The 3rd scenario won't cause a race condition because only non-current in-progress edit log files are validated. {{validateLog}} is actually only used with in-progress files, and could use a better name and Javadoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
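The core of the fix, a fence on how far validation may read, can be modeled abstractly. The toy below is an illustration under assumed names, not the FileJournalManager code: transactions past the provided maxTxId may still be mid-write by the active writer, so the scan must never interpret them.

```java
// Toy model of bounding in-progress edit log validation at maxTxId.
// Names and shapes here are illustrative only.
public class BoundedScan {
  static final long INVALID_TXID = -1; // stand-in for HDFS's invalid-txid constant

  // txIds: op ids as they appear in file order; returns the last one that
  // is safely at or below the fence.
  static long lastValidTxId(long[] txIds, long maxTxId) {
    long last = INVALID_TXID;
    for (long txid : txIds) {
      if (txid > maxTxId) {
        break; // beyond the fence: potentially concurrent, unvalidated bytes
      }
      last = txid;
    }
    return last;
  }
}
```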
[jira] [Commented] (HDFS-8978) Erasure coding: fix 2 failed tests of DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718158#comment-14718158 ] Hadoop QA commented on HDFS-8978: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 23s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 36s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 4s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 53m 52s | Tests failed in hadoop-hdfs. 
| | | | 95m 19s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestParallelShortCircuitRead | | | hadoop.fs.contract.hdfs.TestHDFSContractMkdir | | | hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit | | | hadoop.hdfs.server.namenode.TestAllowFormat | | | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens | | | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.server.datanode.TestRefreshNamenodes | | | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.protocol.TestBlockListAsLongs | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing | | | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory | | | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots | | | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.tools.TestGetGroups | | | hadoop.hdfs.TestRemoteBlockReader2 | | | hadoop.hdfs.server.namenode.TestStartup | | | hadoop.hdfs.TestErasureCodingZones | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.server.namenode.TestFSImageWithXAttr | | | hadoop.hdfs.TestRemoteBlockReader | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.fs.contract.hdfs.TestHDFSContractRename | | | hadoop.hdfs.TestBlockReaderLocal | | | hadoop.cli.TestCacheAdminCLI | | | hadoop.hdfs.server.mover.TestMover | | | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | | | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits | | | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality | | | hadoop.hdfs.server.namenode.TestNameNodeRecovery | | | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.fs.loadGenerator.TestLoadGenerator | | | hadoop.hdfs.server.namenode.TestFSImageWithAcl | | | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete | | | hadoop.fs.TestFcHdfsSetUMask | | | hadoop.hdfs.TestPread | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation | | | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA | | | hadoop.hdfs.crypto.TestHdfsCryptoStreams | | | hadoop.fs.viewfs.TestViewFsFileStatusHdfs | | | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization | | | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl | | | hadoop.hdfs.server.namenode.ha.TestHAConfiguration | | | hadoop.hdfs.TestDFSAddressConfig | | | hadoop.tracing.TestTracingShortCircuitLocalRead | | |
[jira] [Created] (HDFS-8987) MapReduce job failed when I set the / foler to the EC zone
Lifeng Wang created HDFS-8987: - Summary: MapReduce job failed when I set the / foler to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718222#comment-14718222 ] Masatake Iwasaki commented on HDFS-8946: Thanks for working on this, [~Hitliuyi]. I read the code of BlockPlacementPolicyDefault for HDFS-8945 recently, so let me comment while my memory is fresh :-) bq. Besides, no need to shuffle the storages, since we only need to check according to the storage type on the datanode once. Here is my understanding of this. Please correct me if I'm wrong: the {{LocatedBlock}} returned by {{ClientProtocol#addBlock}}, {{ClientProtocol#getAdditionalDatanode}} and {{ClientProtocol#updateBlockForPipeline}} contains storageIDs given by {{BlockPlacementPolicy#chooseTarget}}, but the user of these APIs (which is only DataStreamer) does not use the storageIDs. DataStreamer just sends the storage type to the DataNode, and the DataNode decides which volume to use on its own via {{VolumeChoosingPolicy}}. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing a datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage on which to place a block. For a given storage type, we iterate over the storages of the datanode, but the datanode only cares about the storage type. In the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and do a single check to find a good storage of the given type; this is more efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages repeating the same check. 
Besides, no need to shuffle the storages, since we only need to check according to the storage type on the datanode once. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
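The proposed simplification can be sketched as a single per-type probe. This is illustrative only, not the actual patch: it aggregates what the check actually depends on, the storage type's remaining capacity on the node, and tests it once instead of shuffling and scanning every storage.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch: since the placement check depends only on the storage
// type on the datanode, one lookup per requested type replaces the shuffled
// scan over all DatanodeStorageInfo instances.
public class StorageChoiceSketch {
  enum StorageType { DISK, SSD, ARCHIVE }

  // remainingByType: hypothetical per-type remaining capacity on one node.
  static boolean canPlaceBlock(Map<StorageType, Long> remainingByType,
                               StorageType wanted, long blockSize) {
    Long remaining = remainingByType.get(wanted);
    return remaining != null && remaining >= blockSize;
  }

  static boolean demo() {
    Map<StorageType, Long> node = new EnumMap<>(StorageType.class);
    node.put(StorageType.DISK, 512L * 1024 * 1024);
    node.put(StorageType.SSD, 16L * 1024 * 1024);
    long blockSize = 128L * 1024 * 1024;
    return canPlaceBlock(node, StorageType.DISK, blockSize)      // fits
        && !canPlaceBlock(node, StorageType.SSD, blockSize)      // too small
        && !canPlaceBlock(node, StorageType.ARCHIVE, blockSize); // absent
  }
}
```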
[jira] [Commented] (HDFS-5714) Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile
[ https://issues.apache.org/jira/browse/HDFS-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718219#comment-14718219 ] Yi Liu commented on HDFS-5714: -- Thanks [~jingzhao]. I think this is good, can you rebase the patch? Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile Key: HDFS-5714 URL: https://issues.apache.org/jira/browse/HDFS-5714 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5714.000.patch Currently we define specific classes to represent different INode features, such as FileUnderConstructionFeature and FileWithSnapshotFeature. While recording these feature information in memory, the internal information and object references can still cost a lot of memory. For example, for FileWithSnapshotFeature, not considering the INode's local name, the whole FileDiff list (with size n) can cost around 120n bytes. In order to decrease the memory usage, we plan to use byte array to record the UnderConstruction feature and Snapshot feature for INodeFile. Specifically, if we use protobuf's encoding, the memory usage for a FileWithSnapshotFeature can be less than 56n bytes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718235#comment-14718235 ] Li Bo commented on HDFS-8704: - The two failed test cases are about insufficient datanodes. They also fail without the patch. We can handle them in a separate jira. Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch, HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch I test current code on a 5-node cluster using RS(3,2). When a datanode is corrupt, client succeeds to write a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group, this jira will add more test situations. A streamer may encounter some bad datanodes when writing blocks allocated to it. When it fails to connect datanode or send a packet, the streamer needs to prepare for the next block. First it removes the packets of current block from its data queue. If the first packet of next block has already been in the data queue, the streamer will reset its state and start to wait for the next block allocated for it; otherwise it will just wait for the first packet of next block. The streamer will check periodically if it is asked to terminate during its waiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718343#comment-14718343 ] Hadoop QA commented on HDFS-8988: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 32s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 53s | Tests failed in hadoop-hdfs. 
| | | | 232m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752942/HDFS-8988.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/console | This message was automatically generated. Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap<String, LightWeightLinkedSet<Block>>(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so we should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
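The ordering-vs-memory trade-off is the same one the JDK collections expose, which may help illustrate it (this analogy is mine, not from the patch): LinkedHashSet threads two extra references through every entry to preserve insertion order, which plain HashSet omits; the switch proposed here is the LightWeight analogue.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class OrderVsMemory {
    // Keeps insertion order; each entry carries ~2 extra references (prev/next).
    static Set<String> orderedExcess() {
        Set<String> s = new LinkedHashSet<>();
        s.add("blk_2"); s.add("blk_1");
        return s;
    }

    // No ordering guarantee, no per-entry linking overhead: the same trade
    // LightWeightHashSet makes relative to LightWeightLinkedSet.
    static Set<String> unorderedExcess() {
        Set<String> s = new HashSet<>();
        s.add("blk_2"); s.add("blk_1");
        return s;
    }
}
```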
[jira] [Updated] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8501: Attachment: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped: {noformat} triplets[] = {s0, s1, s2, s3} indices[] = {0, 1, 2, 3} {noformat} {{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes: {noformat} triplets[] = {s0, s1, s2, s3, s4} indices[] = {0, 1, 2, 3, 2} {noformat} Then the replica on s2 is removed, and it finally becomes: {noformat} triplets[] = {s0, s1, null, s3, s4} indices[] = {0, 1, -1, 3, 2} {noformat} The worst case is: {noformat} triplets[] = {null, null, null, null, s4, s5, s6, s7} indices[] = {-1, -1, -1, -1, 0, 1, 2, 3} {noformat} We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With the improvement, the worst case becomes: {noformat} triplets[] = {s4, s5, s6, s7, null} indices[] = {0, 1, 2, 3, -1} {noformat} We have one empty slot. Notes: Assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be: {noformat} triplets[] = {s4, s5, s6, s7, null, null, null, null} indices[] = {0, 1, 2, 3, -1, -1, -1, -1} {noformat} But the Balancer uses {{delHint}}, so adding one will always delete one. So this case won't happen for striped and contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
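The compaction described in this issue can be sketched with a simplified model. This is hypothetical illustration code (the real BlockInfoStriped keeps three triplet entries per storage and lives inside the block manager), showing only the move-the-last-item-into-the-freed-slot step:

```java
// Hypothetical simplified model of the compaction: on removal, the last used
// slot is moved into the freed slot, so nulls never accumulate in the middle.
public class StripedSlots {
    final String[] storages; // stand-in for the triplets array
    final byte[] indices;    // block index held by each slot, -1 == empty

    StripedSlots(String[] storages, byte[] indices) {
        this.storages = storages;
        this.indices = indices;
    }

    void removeStorage(int slot) {
        int last = storages.length - 1;
        while (last > slot && storages[last] == null) {
            last--; // skip trailing empty slots
        }
        // move the last used entry into the freed slot, mirroring what
        // BlockInfoContiguous.removeStorage(..) does for contiguous blocks
        storages[slot] = storages[last];
        indices[slot] = indices[last];
        storages[last] = null;
        indices[last] = -1;
    }
}
```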
[jira] [Commented] (HDFS-8964) Provide max TxId when validating in-progress edit log files
[ https://issues.apache.org/jira/browse/HDFS-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718390#comment-14718390 ] Hadoop QA commented on HDFS-8964: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 22s | The applied patch generated 2 new checkstyle issues (total was 162, now 162). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 37s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 21s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 189m 35s | Tests failed in hadoop-hdfs. 
| | | | 235m 24s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.qjournal.server.TestJournal | | | hadoop.hdfs.server.namenode.TestFSNamesystem | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.server.namenode.TestFileJournalManager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752950/HDFS-8964.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/console | This message was automatically generated. Provide max TxId when validating in-progress edit log files --- Key: HDFS-8964 URL: https://issues.apache.org/jira/browse/HDFS-8964 Project: Hadoop HDFS Issue Type: Bug Components: journal-node, namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8964.00.patch, HDFS-8964.01.patch NN/JN validates in-progress edit log files in multiple scenarios, via {{EditLogFile#validateLog}}. 
The method scans through the edit log file to find the last transaction ID. However, an in-progress edit log file could be actively written to, which creates a race condition and causes incorrect data to be read (and later we attempt to interpret the data as ops). Currently {{validateLog}} is used in 3 places: # NN {{getEditsFromTxid}} # JN {{getEditLogManifest}} # NN/JN {{recoverUnfinalizedSegments}} In the first two scenarios we should provide a maximum TxId to validate in the in-progress file. The 3rd scenario won't cause a race condition because only non-current in-progress edit log files are validated. {{validateLog}} is actually only used with in-progress files, and could use a better name and Javadoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
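The bounded scan described above can be sketched as follows. This is a hypothetical simplification (names are stand-ins; the real validation decodes ops from the file rather than from an array), showing only the stop-at-maxTxId behavior:

```java
// Hypothetical sketch: when scanning an in-progress segment, stop at the
// caller-supplied maxTxId so bytes written concurrently past that point are
// never interpreted as edit ops.
public class EditLogScan {
    static final long INVALID_TXID = -1; // stand-in for HdfsConstants.INVALID_TXID

    static long findLastValidTxId(long[] txIdsInFile, long maxTxId) {
        long last = INVALID_TXID;
        for (long txid : txIdsInFile) {
            if (txid > maxTxId) {
                break; // the concurrently-written tail is ignored
            }
            last = txid;
        }
        return last;
    }
}
```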
[jira] [Updated] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8946: - Attachment: HDFS-8946.003.patch Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check. Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
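The per-type check argued for in this issue can be sketched with a reduced model. Everything here is a hypothetical stand-in (not the HDFS classes); the point is that one lookup per (datanode, storage type) replaces the shuffled per-storage loop:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reduced model: the datanode aggregates remaining space per
// storage type, so suitability for a given type is answered in one lookup
// rather than a shuffled loop over every storage.
public class DatanodeSketch {
    final Map<String, Long> remainingByType = new HashMap<>();

    // One check per (datanode, storage type): the shape of the proposed logic.
    boolean canHostBlock(String storageType, long blockSize) {
        Long remaining = remainingByType.get(storageType);
        return remaining != null && remaining >= blockSize;
    }
}
```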
[jira] [Updated] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8501: Status: Patch Available (was: Open) Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped: {noformat} triplets[] = {s0, s1, s2, s3} indices[] = {0, 1, 2, 3} {noformat} {{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes: {noformat} triplets[] = {s0, s1, s2, s3, s4} indices[] = {0, 1, 2, 3, 2} {noformat} Then the replica on s2 is removed, and it finally becomes: {noformat} triplets[] = {s0, s1, null, s3, s4} indices[] = {0, 1, -1, 3, 2} {noformat} The worst case is: {noformat} triplets[] = {null, null, null, null, s4, s5, s6, s7} indices[] = {-1, -1, -1, -1, 0, 1, 2, 3} {noformat} We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With the improvement, the worst case becomes: {noformat} triplets[] = {s4, s5, s6, s7, null} indices[] = {0, 1, 2, 3, -1} {noformat} We have one empty slot. Notes: Assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be: {noformat} triplets[] = {s4, s5, s6, s7, null, null, null, null} indices[] = {0, 1, 2, 3, -1, -1, -1, -1} {noformat} But the Balancer uses {{delHint}}, so adding one will always delete one. So this case won't happen for striped and contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718197#comment-14718197 ] GAO Rui commented on HDFS-7285: --- Thank you very much [~brahmareddy], [~zhz]. {quote} 1) snapshot feature 2) balancer feature {quote} These may be developed in future EC work; we could add them to the system test plan and implement the tests later. {quote} 4) parallel writes 5) parallel reads {quote} I think {{parallel reads}} means more than one client trying to read the same EC file from HDFS, right? What does {{parallel writes}} refer to in EC system testing? Could you explain the scenario? {quote} 1. Good points from Brahma Reddy Battula, I suggest that we also add HSM/mover tests to the list. 2. In reading tests we can distinguish stateful read and pread. Maybe we should test the seek-and-read scenario too. 3. It seems each test scenario in the Tips for EC Writing/Reading section is systematically labeled. Will the labels be used to drive automatic testing? {quote} We can also add {{HSM/mover}} to the test plan and implement it in future work. Regarding distinguishing the reads, we currently implement the system test using FSShell commands in a terminal, like {{CopyFromLocal}} and {{CopyToLocal}}. Can we make the client read an EC file with a particular mechanism, like stateful read or pread, from a terminal command? The labels in the EC Writing/Reading tests were generated by the test script during the test process, but it is also possible to drive automatic testing from the scenario labels, vice versa.
Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Compare-consolidated-20150824.diff, Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, HDFSErasureCodingSystemTestPlan-20150824.pdf, HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed as of Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back.
We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies and makes it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718232#comment-14718232 ] Yi Liu commented on HDFS-8946: -- Thanks Masatake for the comments. {quote} LocatedBlock returned by ClientProtocol#addBlock, ClientProtocol#getAdditionalDatanode and ClientProtocol#updateBlockForPipeline contains storageIDs given by BlockPlacementPolicy#chooseTarget but the user of these APIs (which is only DataStreamer) does not use storageIDs. DataStreamer just sends the storage type to the DataNode and the DataNode decides which volume to use on its own by using VolumeChoosingPolicy. {quote} Yes, {{storageIDs}} is not used. {quote} Any reason to change the logic of remaining size checking? {quote} Nice find. It was my fault to forget to add {{blockSize *}} while copying this logic from the original {{isGoodTarget}}. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check.
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718199#comment-14718199 ] Hudson commented on HDFS-8963: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2245 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2245/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718200#comment-14718200 ] Hudson commented on HDFS-8900: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2245 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2245/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableListXAttr xAttrs; {code} Currently we use above in XAttrFeature, it's not efficient from memory point of view, since {{ImmutableList}} and {{XAttr}} have object memory overhead, and each object has memory alignment. We can use a {{byte[]}} in XAttrFeature and do some compact in {{XAttr}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718234#comment-14718234 ] Yi Liu commented on HDFS-8946: -- Will update the patch later. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check. Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8973) NameNode exit without any exception log
[ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718325#comment-14718325 ] kanaka kumar avvaru commented on HDFS-8973: --- Regarding the logs not being printed: it looks like log4j reports [only the first error in an appender by default|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/helpers/OnlyOnceErrorHandler.html] and does not recover the log file. So it's recommended to configure {{FallbackErrorHandler}} or some other alternative to ensure logs are not missed. Regarding the process exit, we are missing something about the cause. Even after the log4j error, the system functioned well for some time; the actual reason may not be visible since the logs are not present. {quote}it seems to be caused by a log4j ERROR.{quote} IMO we can't conclude this is the reason for the process exit, as the NN appears to have kept functioning for some time after this message. NameNode exit without any exception log --- Key: HDFS-8973 URL: https://issues.apache.org/jira/browse/HDFS-8973 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.1 Reporter: He Xiaoqiao Priority: Critical The namenode process exited without any useful WARN/ERROR log. After output to the .log file was interrupted, the .out file continued to show about 5 minutes of GC logs. When the .log file was interrupted, the .out file printed the following ERROR, which may hint at the cause; it seems to be caused by a log4j ERROR.
{code:title=namenode.out|borderStyle=solid} log4j:ERROR Failed to flush writer, java.io.IOException: Bad file descriptor (错误的文件描述符) at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(FileOutputStream.java:318) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291) at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295) at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141) at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229) at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276) at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432) at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209) at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-7285: -- Attachment: HDFSErasureCodingSystemTestReport-20150826.pdf Based on the latest version of branch HDFS-7285, we implemented the system test according to the test plan [^HDFSErasureCodingSystemTestPlan-20150824.pdf]. We failed to test some scenarios in the EC file writing/reading test case because of problems that are not related to HDFS but are ssh issues in the test script. We will figure out the problem and implement the remaining test scenarios ASAP. Thanks to [~jingzhao] and [~szetszwo] for helping [~tfukudom] and our team during the whole process of test planning and implementation. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Compare-consolidated-20150824.diff, Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, HDFSErasureCodingSystemTestPlan-20150824.pdf, HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. 
This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed after Hadoop 2.0 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that will not be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. For these reasons, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, making it self-contained and independently maintainable. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features such as caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding, making the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
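The overhead comparison in the description above is simple arithmetic: a (k data + m parity) Reed-Solomon scheme adds m/k extra storage, versus 2x extra for 3-replica. A minimal sketch of that arithmetic (class and method names are illustrative, not from the HDFS codebase):

```java
public class EcOverhead {
    // Extra storage as a fraction of the raw data size for a
    // (dataBlocks + parityBlocks) Reed-Solomon scheme.
    static double ecOverhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;
    }

    // 3-replica keeps 3 copies: 2 extra copies = 200% overhead.
    static double replicationOverhead(int replicas) {
        return replicas - 1.0;
    }

    public static void main(String[] args) {
        System.out.println(ecOverhead(10, 4));      // 0.4 -> 40% extra
        System.out.println(replicationOverhead(3)); // 2.0 -> 200% extra
    }
}
```

With 10+4 RS the cluster can also survive any 4 block losses per stripe, whereas 3-replica only survives 2 losses per block, which is what makes the trade attractive for cold data.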
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718226#comment-14718226 ] Masatake Iwasaki commented on HDFS-8946: DatanodeDescriptor.java
{code}
final long requiredSize =
    blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
final long scheduledSize = getBlocksScheduled(t);
long remaining = 0;
DatanodeStorageInfo storage = null;
for (DatanodeStorageInfo s : getStorageInfos()) {
  if (s.getState() == State.NORMAL
      && s.getStorageType() == t) {
    if (storage == null) {
      storage = s;
    }
    long r = s.getRemaining();
    if (r >= requiredSize) {
      remaining += r;
    }
  }
}
if (requiredSize > remaining - scheduledSize) {
  return null;
{code}
{{scheduledSize}} is a number of blocks but is used as if it were bytes. Any reason to change the logic of the remaining-size check? Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode, but the check only depends on the storage type: in the loop we check according to the storage type and return the first storage of that type that meets the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type. This is more efficient when the storages of that type on the datanode don't meet the requirement, since we don't need to loop over all storages and repeat the same check. 
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
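The single-pass idea proposed for HDFS-8946 above can be sketched with a toy model: the datanode answers "give me one storage of type T with room for this block" in one scan, instead of the caller shuffling and re-checking every storage. All the types and the {{chooseStorage4Block}} name below are simplified stand-ins, not the actual {{DatanodeDescriptor}} API:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of choosing a storage by type in a single pass.
public class ChooseStorageSketch {
    enum StorageType { DISK, SSD }

    static class Storage {
        final StorageType type;
        final long remaining;
        Storage(StorageType type, long remaining) {
            this.type = type;
            this.remaining = remaining;
        }
    }

    static class Datanode {
        final List<Storage> storages = new ArrayList<>();

        // Single pass: return the first storage of the given type with
        // enough remaining space, or null if none qualifies.  No
        // shuffling, no repeated per-storage checks by the caller.
        Storage chooseStorage4Block(StorageType t, long blockSize) {
            for (Storage s : storages) {
                if (s.type == t && s.remaining >= blockSize) {
                    return s;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Datanode dn = new Datanode();
        dn.storages.add(new Storage(StorageType.DISK, 10));
        dn.storages.add(new Storage(StorageType.SSD, 1000));
        System.out.println(dn.chooseStorage4Block(StorageType.SSD, 128) != null);  // true
        System.out.println(dn.chooseStorage4Block(StorageType.DISK, 128) == null); // true
    }
}
```

The real patch also aggregates remaining space across storages of the type; the point here is only the shape of the change: one query per (datanode, storage type) rather than an iteration per candidate storage.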
[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718308#comment-14718308 ] kanaka kumar avvaru commented on HDFS-8892: --- Hi [~Ravikumar], are you planning to update the code and produce a patch as per [~cmccabe]'s suggestion? If yes, feel free to assign the JIRA to yourself. Otherwise, I will create the patch. ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too - Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Assignee: kanaka kumar avvaru Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it handled an invalid slot too, in an extra pass over the evictable map:
{code}
for (ShortCircuitReplica replica : evictable.values()) {
  if (!replica.getSlot().isValid()) {
    purge(replica);
  }
}
// Existing code...
int numDemoted = demoteOldEvictableMmaped(curMs);
int numPurged = 0;
Long evictionTimeNs = Long.valueOf(0);
…
{code}
Apps like HBase can tweak the expiry/staleness/cache-size params in the DFS client so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot invalidation will happen during block invalidation/deletes (primarily triggered by compaction/shard-takeover, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
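Cleaned up, the extra pass suggested above looks roughly like the following. Note that removing entries while iterating the map's values needs an explicit iterator (a plain for-each plus a map removal would throw {{ConcurrentModificationException}}). The {{Replica}}/{{Slot}} types here are simplified stand-ins for the ShortCircuitCache classes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of an extra CacheCleaner pass that purges replicas whose
// shared-memory slot has been invalidated, independent of expiry time.
public class SlotSweepSketch {
    static class Slot {
        private boolean valid = true;
        void makeInvalid() { valid = false; }
        boolean isValid()  { return valid; }
    }

    static class Replica {
        final Slot slot = new Slot();
    }

    // Remove every replica whose slot is no longer valid; return the
    // number purged, mirroring the cleaner's numPurged bookkeeping.
    static int purgeInvalidSlots(List<Replica> evictable) {
        int numPurged = 0;
        for (Iterator<Replica> it = evictable.iterator(); it.hasNext(); ) {
            if (!it.next().slot.isValid()) {
                it.remove(); // stands in for purge(replica)
                numPurged++;
            }
        }
        return numPurged;
    }

    public static void main(String[] args) {
        List<Replica> evictable = new ArrayList<>();
        Replica stale = new Replica();
        stale.slot.makeInvalid();
        evictable.add(new Replica());
        evictable.add(stale);
        System.out.println(purgeInvalidSlots(evictable)); // 1
        System.out.println(evictable.size());             // 1
    }
}
```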
[jira] [Commented] (HDFS-8967) Create a BlockManagerLock class to represent the lock used in the BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718588#comment-14718588 ] Daryn Sharp commented on HDFS-8967: --- This mimics a subset of what I've been working on, so I'm ok with it after pre-commit succeeds I'll re-review. Create a BlockManagerLock class to represent the lock used in the BlockManager -- Key: HDFS-8967 URL: https://issues.apache.org/jira/browse/HDFS-8967 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8967.000.patch, HDFS-8967.001.patch This jira proposes to create a {{BlockManagerLock}} class to represent the lock used in {{BlockManager}}. Currently it directly points to the {{FSNamesystem}} lock thus there are no functionality changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718396#comment-14718396 ] Walter Su commented on HDFS-8833: - ...encode replication and ecPolicy together (Zhe Zhang) Good thought, Zhe! ... Well it depends on how small (relative to the cell size). We should certainly skip files smaller than a full stripe. (Zhe Zhang) Yes, cellSize is relevant. ...I find the above usecase very compelling, which is why I've been advocating for using the file header bits. I haven't seen much competition for the bits either, and we can also start conservatively when using bits (only as many as we need). (Andrew Wang) Agree. So, have we reached a consensus? Any other thoughts, guys? Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch, HDFS-8833-HDFS-7285.02.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing the EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations, including renaming and nested configuration. Those limitations are valid in encryption for security reasons, but it doesn't make sense to carry them over to EC. This JIRA aims to store the EC schema and cell size at the {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to the file header) as a follow-on. We should also disable changing the EC policy on a non-empty file / dir in the first phase. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
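The "encode replication and ecPolicy together in the file header bits" idea discussed above amounts to packing either value into the same bit field, distinguished by a flag bit. A minimal sketch; the field widths (1 flag bit + 11 value bits) are made up for illustration and are not the actual INodeFile header layout:

```java
// Sketch: pack either a replication factor or an EC policy id into the
// same header bits, distinguished by a 1-bit striped flag.  Widths are
// illustrative only, not the real INodeFile layout.
public class HeaderBitsSketch {
    static final int  VALUE_BITS = 11;
    static final long EC_FLAG    = 1L << VALUE_BITS;
    static final long VALUE_MASK = EC_FLAG - 1;

    static long packReplication(int replication) {
        return replication & VALUE_MASK;          // flag bit clear
    }

    static long packEcPolicy(int ecPolicyId) {
        return EC_FLAG | (ecPolicyId & VALUE_MASK); // flag bit set
    }

    static boolean isStriped(long header) {
        return (header & EC_FLAG) != 0;
    }

    static int value(long header) {
        return (int) (header & VALUE_MASK);
    }

    public static void main(String[] args) {
        long replicated = packReplication(3);
        long striped    = packEcPolicy(1);
        System.out.println(isStriped(replicated) + " " + value(replicated)); // false 3
        System.out.println(isStriped(striped) + " " + value(striped));       // true 1
    }
}
```

This is also why "starting conservatively when using bits" is cheap: unused high bits stay zero and can be claimed later without changing the on-disk meaning of existing headers.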
[jira] [Updated] (HDFS-8950) NameNode refresh doesn't remove DataNodes
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8950: --- Attachment: HDFS-8950.005.patch Tests are all passing. Again. NameNode refresh doesn't remove DataNodes - Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, it doesn't actually remove the node, and the NN UI keeps showing it. The NN may also try to allocate some blocks to that DN during an MR job. This issue is independent of DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. This is different from decom because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14719964#comment-14719964 ] Hadoop QA commented on HDFS-8501: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 39s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 41s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 6s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 184m 49s | Tests failed in hadoop-hdfs. 
| | | | 226m 55s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.TestWriteStripedFileWithFailure | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752982/HDFS-8501-HDFS-7285.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/console | This message was automatically generated. 
Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped:
{noformat}
triplets[] = {s0, s1, s2, s3}
indices[] = {0, 1, 2, 3}
{noformat}
{{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, s4}
indices[] = {0, 1, 2, 3, 2}
{noformat}
Then the replica on s2 is removed, and finally it becomes:
{noformat}
triplets[] = {s0, s1, null, s3, s4}
indices[] = {0, 1, -1, 3, 2}
{noformat}
The worst case is:
{noformat}
triplets[] = {null, null, null, null, s4, s5, s6, s7}
indices[] = {-1, -1, -1, -1, 0, 1, 2, 3}
{noformat}
We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With this improvement, the worst case becomes:
{noformat}
triplets[] = {s4, s5, s6, s7, null}
indices[] = {0, 1, 2, 3, -1}
{noformat}
We have one empty slot. Notes: assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be:
{noformat}
triplets[] = {s4, s5, s6, s7, null, null, null, null}
indices[] = {0, 1, 2, 3, -1, -1, -1, -1}
{noformat}
But the Balancer uses {{delHint}}, so adding one replica is always paired with deleting one, and this case won't happen for striped or contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do a further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
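The move-the-last-item-into-the-hole compaction described above can be sketched with plain arrays. Storages are represented as strings and indices as bytes, which only approximates the real triplets/indices layout in {{BlockInfoStriped}}:

```java
import java.util.Arrays;

// Sketch of BlockInfoContiguous.removeStorage-style compaction applied
// to a striped block: on removal, move the last occupied slot into the
// hole so empty slots collect at the tail instead of fragmenting.
public class RemoveStorageSketch {
    String[] storages = {"s0", "s1", "s2", "s3", "s4"};
    byte[] indices   = {0, 1, 2, 3, 2}; // s4 holds a new copy of block idx 2

    void removeStorage(int slot) {
        int last = storages.length - 1;
        while (last > slot && storages[last] == null) {
            last--; // skip trailing empty slots
        }
        // Move the last occupied entry into the freed slot, then clear it.
        storages[slot] = storages[last];
        indices[slot]  = indices[last];
        storages[last] = null;
        indices[last]  = -1;
    }

    public static void main(String[] args) {
        RemoveStorageSketch b = new RemoveStorageSketch();
        b.removeStorage(2); // drop the stale copy on s2
        System.out.println(Arrays.toString(b.storages)); // [s0, s1, s4, s3, null]
        System.out.println(Arrays.toString(b.indices));  // [0, 1, 2, 3, -1]
    }
}
```

Compared to leaving the hole at slot 2 (the {{null}}/-1 pattern in the JIRA description), the empty slot stays at the tail, so the arrays can be trimmed.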
[jira] [Updated] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-2070: --- Status: Patch Available (was: In Progress) Ready for review. This should be a quick and easy one. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There are no automated tests for FsShell getmerge in HDFS. For reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14719294#comment-14719294 ] Hadoop QA commented on HDFS-8946: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 5s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 20s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 18s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 185m 32s | Tests failed in hadoop-hdfs. 
| | | | 231m 51s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752975/HDFS-8946.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/console | This message was automatically generated. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode, but the check only depends on the storage type: in the loop we check according to the storage type and return the first storage of that type that meets the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type. This is more efficient when the storages of that type on the datanode don't meet the requirement, since we don't need to loop over all storages and repeat the same check. 
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-2070: --- Attachment: HDFS-2070.001.patch Turns out there was already one getmerge test in testHDFSConf.xml, but this patch adds a few more to cover all the bases. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There are no automated tests for FsShell getmerge in HDFS. For reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720929#comment-14720929 ] Hudson commented on HDFS-8925: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2249 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2249/]) HDFS-8925. Move BlockReaderLocal to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev e2c9b288b223b9fd82dc12018936e13128413492) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java *
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720913#comment-14720913 ] Hadoop QA commented on HDFS-8983: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 32s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 43s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 34s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 171m 1s | Tests failed in hadoop-hdfs. 
| | | | 238m 49s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.web.TestWebHDFSForHA | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestFsck | | | org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753080/HDFS-8983.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/console | This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
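The protected-directories rule above (a marked directory may only be deleted when it is empty) can be sketched as follows. The class, the {{childCount}} parameter, and the use of {{SecurityException}} are illustrative stand-ins for the FSDirectory checks in the patch, which are not shown in this thread:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: refuse deletion of a protected directory unless it is empty.
public class ProtectedDirsSketch {
    final Set<String> protectedDirs = new HashSet<>();

    // Throws if the delete should be refused; otherwise returns normally.
    void checkDelete(String path, int childCount) {
        if (protectedDirs.contains(path) && childCount > 0) {
            throw new SecurityException(
                "Cannot delete non-empty protected directory " + path);
        }
    }

    public static void main(String[] args) {
        ProtectedDirsSketch fsd = new ProtectedDirsSketch();
        fsd.protectedDirs.add("/Users");
        fsd.checkDelete("/tmp/scratch", 5); // unprotected: allowed
        fsd.checkDelete("/Users", 0);       // protected but empty: allowed
        try {
            fsd.checkDelete("/Users", 12);  // protected, non-empty: refused
        } catch (SecurityException e) {
            System.out.println("refused");
        }
    }
}
```

Allowing empty protected directories to be deleted keeps the feature from blocking legitimate cleanup while still stopping an inadvertent recursive delete of something like /Users.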
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720907#comment-14720907 ] Hadoop QA commented on HDFS-8965: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 46s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 33s | The applied patch generated 7 new checkstyle issues (total was 401, now 402). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 26s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 146m 58s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 3m 56s | Tests passed in bkjournal. 
| | | | 198m 31s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753081/HDFS-8965.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/testrun_hadoop-hdfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/console | This message was automatically generated. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. 
This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
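The hardening described above can be sketched roughly as follows. This is a minimal illustration, not the actual FSEditLogOp reader: the class and method names are invented, and the real edit log's framing and checksum algorithm may differ. The point is only that the length prefix is sanity-checked and the checksum verified over the raw bytes before any large allocation or deserialization happens.

```java
import java.util.zip.CRC32;

// Hypothetical sketch: validate the op's length prefix and checksum before
// allocating a buffer for, or deserializing, the op body. Names are
// illustrative stand-ins for the real edit log reader.
class OpFrameCheck {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024; // illustrative cap

    // Reject implausible lengths before any large allocation happens.
    static boolean isPlausibleLength(int len) {
        return len > 0 && len <= MAX_OP_SIZE;
    }

    // Verify the checksum over the raw bytes before deserializing them.
    static boolean checksumMatches(byte[] body, long expected) {
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        return crc.getValue() == expected;
    }
}
```

A reader following this pattern would skip or fail the op cleanly instead of attempting to allocate a buffer sized by garbage data.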
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720899#comment-14720899 ] Yi Liu commented on HDFS-8946: -- The test failures are not related; they pass in my local environment. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage for a block: for a given storage type, we iterate over the storages of the datanode, check each one against the storage type, and return the first storage of that type that satisfies the requirement. Since the check depends only on the storage type, we can remove the iteration over storages and find a good storage of the given type in one step. This is more efficient when no storage of that type on the datanode satisfies the requirement, because we no longer loop over all storages repeating the same check. Besides, there is no need to shuffle the storages, since we only need one check per storage type on the datanode. This also makes the logic clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false,
        results, avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
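A rough sketch of the proposed single-pass selection. The types here are simplified stand-ins, not the real {{BlockPlacementPolicyDefault}} API: the point is that one linear scan per requested type replaces the shuffled nested loops.

```java
import java.util.List;

// Illustrative sketch (types simplified, not the real HDFS classes): instead
// of shuffling and re-scanning every storage for each requested type, find a
// single good storage of the requested type in one pass.
class StoragePicker {
    enum StorageType { DISK, SSD, ARCHIVE }

    static class StorageInfo {
        final StorageType type;
        final long remaining;
        StorageInfo(StorageType type, long remaining) {
            this.type = type;
            this.remaining = remaining;
        }
    }

    // One pass: return the first storage of the requested type with enough
    // remaining space, or null if none fits. No shuffle, no repeated scans.
    static StorageInfo chooseStorage(List<StorageInfo> storages,
                                     StorageType type, long blocksize) {
        for (StorageInfo s : storages) {
            if (s.type == type && s.remaining >= blocksize) {
                return s;
            }
        }
        return null;
    }
}
```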
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720967#comment-14720967 ] Jitendra Nath Pandey commented on HDFS-8983: Thanks for addressing my comments [~arpitagarwal]. +1 for the patch. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8978) Erasure coding: fix 2 failed tests of DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720954#comment-14720954 ] Hadoop QA commented on HDFS-8978: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 14s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 53s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 43s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 101m 26s | Tests failed in hadoop-hdfs. 
| | | | 144m 40s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Timed out tests | org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752917/HDFS-8978-HDFS-7285.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/console | This message was automatically generated. Erasure coding: fix 2 failed tests of DFSStripedOutputStream Key: HDFS-8978 URL: https://issues.apache.org/jira/browse/HDFS-8978 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8978-HDFS-7285.01.patch, HDFS-8978-HDFS-7285.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720982#comment-14720982 ] Hadoop QA commented on HDFS-8983: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 27s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 59s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 28s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 20s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 111m 1s | Tests failed in hadoop-hdfs. 
| | | | 180m 54s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753112/HDFS-8983.04.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/console | This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8983: Attachment: HDFS-8983.04.patch Thanks [~jnp]! Addressed in the .04 patch. Also updated {{FSDirectory#normalizePaths}} with more checks. Delta from .03 to .04:
{code}
- checkProtectedDescendants(fsd, src.endsWith(Path.SEPARATOR) ?
-     src.substring(0, src.length() - 1) : src);
+ checkProtectedDescendants(fsd, fsd.normalizePath(src));
{code}
{code}
-// {@link Path#SEPARATOR} is "/".
+// {@link Path#SEPARATOR} is "/" and '0' is the next ASCII
+// character after '/'.
{code}
{code}
- * Reserved paths are ignored.
+ * Reserved paths, relative paths and paths with scheme are ignored.
{code}
{code}
-final Collection<String> normalized = new ArrayList<String>(paths.size());
-for (String path : paths) {
-  if (isReservedName(path)) {
-    LOG.error("{} ignoring reserved path {}", errorString, path);
+final Collection<String> normalized = new ArrayList<>(paths.size());
+for (String dir : paths) {
+  if (isReservedName(dir)) {
+    LOG.error("{} ignoring reserved path {}", errorString, dir);
   } else {
-    normalized.add(normalizePath(path));
+    final Path path = new Path(dir);
+    if (!path.isAbsolute()) {
+      LOG.error("{} ignoring relative path {}", errorString, dir);
+    } else if (path.toUri().getScheme() != null) {
+      LOG.error("{} ignoring path {} with scheme", errorString, dir);
+    } else {
+      normalized.add(path.toString());
+    }
{code}
And three new tests for the additional checks in {{normalizePaths}}. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
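The extra path checks added to {{normalizePaths}} can be approximated in a standalone sketch. This uses {{java.net.URI}} in place of Hadoop's {{Path}} and an illustrative reserved-path prefix, so it is an assumption-laden approximation of the filtering logic, not the patch itself: reserved paths, relative paths, and paths carrying a scheme are all dropped.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Standalone approximation of the normalizePaths filtering (java.net.URI
// stands in for Hadoop's Path; the reserved prefix is illustrative).
class ProtectedDirFilter {
    static boolean isReserved(String path) {
        return path.startsWith("/.reserved"); // illustrative reserved prefix
    }

    static Collection<String> normalizePaths(List<String> paths) {
        Collection<String> normalized = new ArrayList<>(paths.size());
        for (String dir : paths) {
            URI uri = URI.create(dir);
            if (isReserved(dir)) {
                continue;                     // reserved path: ignored
            } else if (uri.getScheme() != null) {
                continue;                     // path with scheme: ignored
            } else if (!dir.startsWith("/")) {
                continue;                     // relative path: ignored
            }
            normalized.add(dir);
        }
        return normalized;
    }
}
```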
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720070#comment-14720070 ] Bob Hansen commented on HDFS-8855: -- Xiaobing - is there a race condition in initializing the ugiCache? If two threads make simultaneous requests, one of them will succeed in the CAS for ugiCacheInit, and the other will proceed ahead. If the latter thread immediately tries to reference the ugiCache while the first is still initializing it, we will get an NPE or a partially-constructed object. See http://www.journaldev.com/1377/java-singleton-design-pattern-best-practices-with-examples for a nice little discussion of idiomatic singletons in Java; if we're supporting JRE >= 1.5, the Pugh construction is clean and works well for concurrent access. Webhdfs client leaks active NameNode connections Key: HDFS-8855 URL: https://issues.apache.org/jira/browse/HDFS-8855 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Environment: HDP 2.2 Reporter: Bob Hansen Assignee: Xiaobing Zhou Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS_8855.prototype.patch The attached script simulates a process opening ~50 files via webhdfs and performing random reads. Note that there are at most 50 concurrent reads, and all webhdfs sessions are kept open. Each read is ~64k at a random position. The script periodically (once per second) shells into the NameNode and produces a summary of the socket states. For my test cluster with 5 nodes, it took ~30 seconds for the NameNode to reach ~25000 active connections and fail. It appears that each request to the webhdfs client is opening a new connection to the NameNode and keeping it open after the request is complete. If the process continues to run, eventually (~30-60 seconds), all of the open connections are closed and the NameNode recovers. This smells like SoftReference reaping.
Are we using SoftReferences in the webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
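The initialization-on-demand holder (Pugh) idiom Bob suggests could look like the sketch below. {{UgiCache}} here is a stand-in for the real webhdfs cache type; the idiom itself is standard: the JVM guarantees the holder class is initialized exactly once, before any thread can observe {{INSTANCE}}, so no CAS flag or partially-constructed cache is possible.

```java
// Stand-in for the real webhdfs UGI cache type.
class UgiCache {
    // ... cache state would live here ...
}

class UgiCacheSingleton {
    private UgiCacheSingleton() {}

    // The holder class is not loaded until getCache() is first called;
    // class initialization is serialized by the JVM, which provides the
    // thread safety for free.
    private static class Holder {
        static final UgiCache INSTANCE = new UgiCache();
    }

    static UgiCache getCache() {
        return Holder.INSTANCE;
    }
}
```

Every caller sees the same fully-constructed instance, with no explicit locking on the hot path.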
[jira] [Updated] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8938: - Status: Patch Available (was: Reopened) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}}, from {{BlockManager}} into standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications into corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720296#comment-14720296 ] Hadoop QA commented on HDFS-8950: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 49s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 161m 37s | Tests passed in hadoop-hdfs. 
| | | | 206m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752995/HDFS-8950.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / beb65c9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/console | This message was automatically generated. NameNode refresh doesn't remove DataNodes - Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, the DN is not actually removed and the NN UI keeps showing that node. The NN may try to allocate some blocks to that DN as well during an MR job. This issue is independent of DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. This is different from decom because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720043#comment-14720043 ] Chris Nauroth commented on HDFS-8155: - {code} client.setConnectTimeout(URLConnectionFactory.DEFAULT_SOCKET_TIMEOUT, TimeUnit.MILLISECONDS); {code} Sorry if my earlier comment was unclear, but I think we need to call both {{client.setConnectTimeout}} and {{client.setReadTimeout}}. Otherwise, we could have a successful connection, but then hang indefinitely on a non-responsive server. +1 after that. I don't know what happened with that last Jenkins run. It's building fine for me locally. Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
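Chris's point about needing both timeouts can be illustrated with the stdlib {{HttpURLConnection}}. This is only an analogy: the patch itself uses an OkHttp-style client, where the corresponding calls are {{setConnectTimeout}} and {{setReadTimeout}}, and {{DEFAULT_SOCKET_TIMEOUT_MS}} below is an illustrative stand-in for {{URLConnectionFactory.DEFAULT_SOCKET_TIMEOUT}}. Setting only the connect timeout still allows an indefinite hang after the TCP connection succeeds, which is why both must be set.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Stdlib analogy for the OkHttp-style timeout configuration in the patch.
class TimeoutDemo {
    static final int DEFAULT_SOCKET_TIMEOUT_MS = 60_000; // illustrative value

    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(DEFAULT_SOCKET_TIMEOUT_MS); // bounds connect()
            conn.setReadTimeout(DEFAULT_SOCKET_TIMEOUT_MS);    // bounds each read()
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```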
[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720116#comment-14720116 ] Ravikumar commented on HDFS-8892: - Please feel free to contribute the patch [~kanaka]. I am not currently looking to submit it. ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too - Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Assignee: kanaka kumar avvaru Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it also handled invalid slots in an extra pass over the evictable map: for (ShortCircuitReplica replica : evictable.values()) { if (!replica.getSlot().isValid()) { purge(replica); } } //Existing code... int numDemoted = demoteOldEvictableMmaped(curMs); int numPurged = 0; Long evictionTimeNs = Long.valueOf(0); ... Apps like HBase can tweak the expiry/staleness/cache-size params in the DFS client, so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot invalidation will happen during block invalidation/deletes {primarily triggered by compaction/shard-takeover etc.} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Attachment: HDFS-8155.006.patch Fixed ChrisN's final comment. Yeah, Jenkins is being weird for me. I ran through all the HDFS tests manually and, except for a couple of non-repeatable, unrelated failures, everything passed. I'll let Jenkins run again, but unless it's something real, I'll go ahead and commit this later today. Thanks. Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Status: Open (was: Patch Available) Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Status: Patch Available (was: Open) Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720417#comment-14720417 ] Hadoop QA commented on HDFS-2070: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 26s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 1m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 163m 14s | Tests failed in hadoop-hdfs. 
| | | | 180m 4s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestBPOfferService | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753015/HDFS-2070.001.patch | | Optional Tests | javac unit | | git revision | trunk / beb65c9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/console | This message was automatically generated. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There is no automated test for FsShell getmerge in hdfs. With regard to reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720344#comment-14720344 ] Daryn Sharp commented on HDFS-8865: --- +1 This has made a huge difference, and all the possible style warnings were addressed. Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For a big namespace, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds || | 1 (existing) | 55 | | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
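A minimal sketch of the fork-join idea behind the patch: descend the directory tree in parallel and aggregate usage counts bottom-up. {{Node}} and the single {{space}} count are simplified stand-ins for the real INode tree and QuotaCounts, and the task layout is illustrative rather than the actual patch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative fork-join tree scan; Node stands in for the INode tree and
// the long total stands in for QuotaCounts.
class QuotaInit {
    static class Node {
        final long space;                       // space consumed by this inode
        final List<Node> children = new ArrayList<>();
        Node(long space) { this.space = space; }
    }

    static class CountTask extends RecursiveTask<Long> {
        private final Node node;
        CountTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.space;
            List<CountTask> subtasks = new ArrayList<>();
            for (Node child : node.children) {
                CountTask t = new CountTask(child);
                t.fork();                       // scan subtrees in parallel
                subtasks.add(t);
            }
            for (CountTask t : subtasks) {
                total += t.join();              // aggregate child counts
            }
            return total;
        }
    }

    static long initQuota(Node root, int threads) {
        return new ForkJoinPool(threads).invoke(new CountTask(root));
    }
}
```

As the table above suggests, a single fork-join thread pays task overhead versus the plain recursive scan, but the work parallelizes well as threads are added.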
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720363#comment-14720363 ] Kihwal Lee commented on HDFS-8865: -- Thanks [~xyao] and [~daryn] for reviews. I've committed this to trunk and branch-2. Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-8865. -- Resolution: Fixed Fix Version/s: 2.8.0 3.0.0 Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8865: - Hadoop Flags: Reviewed Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720368#comment-14720368 ] Hudson commented on HDFS-8865: -- FAILURE: Integrated in Hadoop-trunk-Commit #8364 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8364/]) HDFS-8865. Improve quota initialization performance. Contributed by Kihwal Lee. (kihwal: rev b6ceee9bf42eec15891f60a014bbfa47e03f563c) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/QuotaCounts.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDiskspaceQuotaUpdate.java Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. 
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
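The fork-join speedup shown in the table above can be illustrated with a small {{RecursiveTask}} that sums per-directory usage over a tree in parallel. This is an illustrative sketch only; the {{Node}}, {{QuotaTask}}, and {{initQuota}} names are hypothetical and are not taken from the HDFS-8865 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical stand-in for an INode directory tree; each node carries a
// local "space consumed" value, and the task sums the whole subtree.
class QuotaDemo {
    static class Node {
        final long localSpace;
        final List<Node> children = new ArrayList<>();
        Node(long localSpace) { this.localSpace = localSpace; }
    }

    // One RecursiveTask is forked per child, mirroring the idea of
    // initializing quota counts for independent subtrees in parallel.
    static class QuotaTask extends RecursiveTask<Long> {
        private final Node node;
        QuotaTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.localSpace;
            List<QuotaTask> subtasks = new ArrayList<>();
            for (Node child : node.children) {
                QuotaTask t = new QuotaTask(child);
                t.fork();                 // schedule subtree asynchronously
                subtasks.add(t);
            }
            for (QuotaTask t : subtasks) {
                total += t.join();        // combine subtree totals
            }
            return total;
        }
    }

    static long initQuota(Node root, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new QuotaTask(root));
        } finally {
            pool.shutdown();
        }
    }
}
```

As in the table, a single fork-join thread carries extra task overhead versus the plain recursive scan; the benefit comes from running the subtree scans on several threads.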
[jira] [Updated] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8865: - Status: In Progress (was: Patch Available) Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For a big namespace, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have.
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720370#comment-14720370 ] Kihwal Lee commented on HDFS-8879: -- Cherry-picked the fix to branch-2.7. Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 3.0.0, 2.7.2 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced.
{code}
fsdir = cluster.getNamesystem().getFSDirectory();
INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8879: - Fix Version/s: (was: 2.8.0) 2.7.2 3.0.0 Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 3.0.0, 2.7.2 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced.
{code}
fsdir = cluster.getNamesystem().getFSDirectory();
INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID
[ https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720375#comment-14720375 ] Haohui Mai commented on HDFS-8779: -- -1 on the 03 patch. For the 04 patch the javascript needs to be minimized. Or it might make sense to create a browserified version of our own. WebUI can't display randomly generated block ID --- Key: HDFS-8779 URL: https://issues.apache.org/jira/browse/HDFS-8779 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, patch-to-json-parse.txt Old releases use randomly generated block IDs (HDFS-4645). The max value of a Long in Java is 2^63-1; the max value of a -number- (*integer*) in JavaScript is 2^53-1 (see [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER]). This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER. An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID
[ https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720399#comment-14720399 ] Haohui Mai commented on HDFS-8779: -- I'll upload a patch to demonstrate the idea later today. WebUI can't display randomly generated block ID --- Key: HDFS-8779 URL: https://issues.apache.org/jira/browse/HDFS-8779 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, patch-to-json-parse.txt Old releases use randomly generated block IDs (HDFS-4645). The max value of a Long in Java is 2^63-1; the max value of a -number- (*integer*) in JavaScript is 2^53-1 (see [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER]). This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER. An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
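The precision problem described in HDFS-8779 can be reproduced directly in Java, since JavaScript numbers are IEEE-754 doubles: a long above 2^53 does not survive a round trip through a double. A minimal sketch (the class and method names are hypothetical):

```java
// Demonstrates why a randomly generated 63-bit block ID cannot survive a
// round trip through an IEEE-754 double -- the representation JavaScript
// uses for all numbers. Values above 2^53 lose their low-order bits.
class BlockIdPrecision {
    static final long MAX_SAFE = (1L << 53) - 1;  // JS Number.MAX_SAFE_INTEGER

    // True iff the id is unchanged after being coerced through a double.
    static boolean roundTripsExactly(long id) {
        return (long) (double) id == id;
    }

    public static void main(String[] args) {
        long small = MAX_SAFE;           // still exactly representable
        long blockId = (1L << 60) + 1;   // typical magnitude of a random block ID
        System.out.println(roundTripsExactly(small));    // exact
        System.out.println(roundTripsExactly(blockId));  // low bit is lost
    }
}
```

This is why the WebUI must treat block IDs as strings (or use a JSON parser that preserves 64-bit integers) instead of plain JavaScript numbers.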
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720750#comment-14720750 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #318 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/318/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720768#comment-14720768 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, it doesn't actually remove it, and the NN UI keeps showing that node. It may also try to allocate some blocks to that DN during an MR job. This issue is independent from DN decommission. To reproduce:
1. Add a DN to dfs_hosts_allow
2. Refresh NN
3. Start DN. Now NN starts seeing DN.
4. Stop DN
5. Remove DN from dfs_hosts_allow
6. Refresh NN - NN is still reporting DN as being used by HDFS.
This is different from decommissioning because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
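The intended refresh semantics can be sketched as plain set logic: on refresh, a registered node that is absent from a non-empty allowed list must be dropped, while an empty allowed list means no restriction. This is a simplified illustration with hypothetical names, not the actual {{DatanodeManager}}/{{HostFileManager}} code.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of include/exclude host-list semantics. On refresh, a
// node missing from a non-empty allowed list must be removed, not merely
// hidden; an empty allowed list admits every registered node.
class HostListRefresh {
    static Set<String> refresh(Set<String> registered,
                               Set<String> allowed,
                               Set<String> excluded) {
        Set<String> result = new HashSet<>();
        for (String dn : registered) {
            boolean included = allowed.isEmpty() || allowed.contains(dn);
            if (included && !excluded.contains(dn)) {
                result.add(dn);
            }
        }
        return result;
    }
}
```

With this rule, the repro above behaves as expected: after dn2 is removed from the allowed list and the NN is refreshed, dn2 no longer appears among live nodes even though it was never placed on the exclude list.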
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720770#comment-14720770 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720769#comment-14720769 ] Hudson commented on HDFS-8925: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8925. Move BlockReaderLocal to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev e2c9b288b223b9fd82dc12018936e13128413492) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ClientContext.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java *
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Attachment: HDFS-8990.000.patch The v0 patch moves the {{RemoteBlockReader}} and {{RemoteBlockReader2}} classes to the client module. Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720793#comment-14720793 ] Haohui Mai commented on HDFS-8990: --
{code}
public int available() throws IOException {
  // An optimistic estimate of how much data is available
  // to us without doing network I/O.
-  return DFSClient.TCP_WINDOW_SIZE;
+  return HdfsClientConfigKeys.DFS_CLIENT_CACHED_CONN_RETRY_DEFAULT;
}
{code}
This is the wrong constant.
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
@@ -46,6 +46,7 @@
   int DFS_NAMENODE_RPC_PORT_DEFAULT = 8020;
   String DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY = "dfs.namenode.kerberos.principal";
+  int DFS_CLIENT_TCP_WINDOW_SIZE = 128 * 1024; // 128 KB
   String DFS_CLIENT_WRITE_PACKET_SIZE_KEY = "dfs.client-write-packet-size";
   int DFS_CLIENT_WRITE_PACKET_SIZE_DEFAULT = 64*1024;
   String DFS_CLIENT_SOCKET_TIMEOUT_KEY = "dfs.client.socket-timeout";
{code}
{{TCP_WINDOW_SIZE}} is a constant that is only used by {{RemoteBlockReader}} / {{RemoteBlockReader2}}. Let's put it into {{RemoteBlockReader2}} instead. Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979].
While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720796#comment-14720796 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2267 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2267/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Attachment: HDFS-8990.001.patch Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch, HDFS-8990.001.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8983: Attachment: HDFS-8983.03.patch NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
Mingliang Liu created HDFS-8990: --- Summary: Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Fix Version/s: (was: 2.8.0) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Hadoop Flags: (was: Reviewed) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720699#comment-14720699 ] Colin Patrick McCabe commented on HDFS-8965: bq. Do we also already have tests for invalid op lengths (e.g. greater than max op size)? I see testFuzzSequences but that's not explicit. {{TestNameNodeRecovery#testNonDefaultMaxOpSize}} tests maximum op sizes. The latest patch fixes the test failure in {{TestJournal}}. The issue was that we need to ensure that {{scanOp}} works when the edit log version is newer than the latest version. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
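The hardening strategy in the description above, checking the length prefix against a cap and validating the checksum before decoding, can be sketched as follows. The record layout used here (int length, payload bytes, int CRC32) is a hypothetical stand-in, not the real {{FSEditLogOp}} on-disk format, and {{SafeOpReader}}/{{MAX_OP_SIZE}} are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Sketch of hardening an op reader against garbage input: reject the
// length prefix before allocating, and verify the checksum before the
// payload is ever interpreted as an op.
class SafeOpReader {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024;  // illustrative cap

    static byte[] readOp(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > MAX_OP_SIZE) {
            // Reject before allocating: a garbage length would otherwise
            // trigger a huge array allocation and an OutOfMemoryError.
            throw new IOException("op length " + len + " out of range");
        }
        byte[] data = new byte[len];
        in.readFully(data);
        long expected = in.readInt() & 0xffffffffL;  // stored CRC, unsigned
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch; refusing to decode op");
        }
        return data;  // only now is it safe to parse as an op
    }
}
```

The ordering is the point: both checks run on fixed-size metadata before any variable-size buffer is trusted, so corrupt edit log bytes fail with a clean IOException instead of an OOM.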
[jira] [Updated] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8965: --- Attachment: HDFS-8965.005.patch Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] (was: This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. ) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] was:This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. was: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. 
While we need to replace _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards around LOG.debug() and LOG.trace() calls in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
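The guard removal tracked in HDFS-8971 rests on a property of slf4j: its parameterized messages defer argument formatting until the level is known to be enabled, so the explicit `isDebugEnabled()` guard becomes unnecessary for cheap arguments. A minimal self-contained sketch of the two idioms (the `Logger` here is a hand-rolled stand-in written for this example, not the real `org.slf4j.Logger` API):

```java
// Sketch of the logging idioms discussed around HDFS-8971. The Logger below
// is a tiny stand-in so the example compiles alone; real code uses slf4j.
public class LogGuardSketch {
    static class Logger {
        final boolean debugEnabled;
        Logger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }
        boolean isDebugEnabled() { return debugEnabled; }
        // slf4j-style: "{}" is substituted only when debug is enabled, so the
        // full message string is never built for a disabled level.
        String debug(String fmt, Object arg) {
            return debugEnabled ? fmt.replace("{}", String.valueOf(arg)) : null;
        }
    }

    public static void main(String[] args) {
        Logger log = new Logger(false);
        // log4j-era idiom: an explicit guard avoids concatenating the message
        // when debug logging is off.
        if (log.isDebugEnabled()) {
            log.debug("read block " + 42, null);
        }
        // slf4j idiom: no guard needed; formatting is deferred by the logger.
        log.debug("read block {}", 42);
        System.out.println("debug enabled? " + log.isDebugEnabled()); // prints: debug enabled? false
    }
}
```

A guard can still pay off when computing the *argument* itself is expensive, which is one reason guard removal is tracked as its own jira rather than done wholesale in this patch.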
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720727#comment-14720727 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #324 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/324/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from NN's allowed host list (HDFS was HA) and then do NN refresh, it doesn't remove it actually and the NN UI keeps showing that node. It may try to allocate some blocks to that DN as well during an MR job. This issue is independent from DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. 
This differs from decommissioning: there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
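The behavior the fix restores can be sketched as a simple membership check on refresh (illustrative only; the names and signatures below are not the actual DatanodeManager/HostFileManager code):

```java
import java.util.*;

// Illustrative sketch of the refresh rule HDFS-8950 fixes: after
// -refreshNodes, a registered DataNode that no longer appears in a
// non-empty dfs_hosts_allow list must be dropped from the NameNode's view.
public class RefreshSketch {
    static List<String> refresh(List<String> registered, Set<String> includes) {
        // An empty include list means "allow everything" in HDFS semantics.
        if (includes.isEmpty()) return new ArrayList<>(registered);
        List<String> kept = new ArrayList<>();
        for (String dn : registered) {
            if (includes.contains(dn)) kept.add(dn);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> registered = Arrays.asList("dn1:50010", "dn2:50010");
        // dn2 was removed from dfs_hosts_allow; a refresh should drop it.
        Set<String> includes = new HashSet<>(Arrays.asList("dn1:50010"));
        System.out.println(refresh(registered, includes)); // prints: [dn1:50010]
    }
}
```

The bug described above is that the pre-patch refresh only consulted the exclude list, so a node removed from the include list (but never excluded) survived the refresh.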
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720728#comment-14720728 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #324 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/324/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
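The shape of this refactoring, promoting a static nested helper out of a large manager class into a standalone package-private class, can be sketched as follows (illustrative names only, not the actual BlockManager code):

```java
// Before (sketch): the helper is nested inside a large manager class, which
// bloats the enclosing file and makes the helper awkward to test alone.
class BlockManagerBefore {
    static class ReplicationWork {
        final String block;
        ReplicationWork(String block) { this.block = block; }
    }
    String schedule() { return new ReplicationWork("blk_1").block; }
}

// After (sketch): the helper is a standalone package-private class in the
// same package; the manager merely uses it and no longer owns it.
class ReplicationWorkStandalone {
    final String block;
    ReplicationWorkStandalone(String block) { this.block = block; }
}

public class ExtractSketch {
    public static void main(String[] args) {
        System.out.println(new BlockManagerBefore().schedule());          // prints: blk_1
        System.out.println(new ReplicationWorkStandalone("blk_2").block); // prints: blk_2
    }
}
```

Because the extracted classes stay in the same package, the manager can keep calling their package-private members, so behavior is unchanged while the manager class shrinks.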
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720615#comment-14720615 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-trunk-Commit #8367 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8367/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from NN's allowed host list (HDFS was HA) and then do NN refresh, it doesn't remove it actually and the NN UI keeps showing that node. It may try to allocate some blocks to that DN as well during an MR job. This issue is independent from DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. 
This differs from decommissioning: there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8925: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the contribution. Move BlockReaderLocal to hdfs-client Key: HDFS-8925 URL: https://issues.apache.org/jira/browse/HDFS-8925 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8925.000.patch, HDFS-8925.001.patch, HDFS-8925.002.patch This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720620#comment-14720620 ] Hudson commented on HDFS-8865: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2247 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2247/]) HDFS-8865. Improve quota initialization performance. Contributed by Kihwal Lee. (kihwal: rev b6ceee9bf42eec15891f60a014bbfa47e03f563c) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/QuotaCounts.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDiskspaceQuotaUpdate.java Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. 
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
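The parallel recursive scan described above can be sketched with Java's Fork-Join framework. This is illustrative only, not the HDFS patch: the quota-counting walk is reduced to counting files in a toy directory tree, but the task structure (fork one task per subtree, combine on join) is the same idea:

```java
import java.util.*;
import java.util.concurrent.*;

// Toy model of a namespace directory for this sketch.
class Dir {
    int files;                        // files directly under this directory
    List<Dir> children = new ArrayList<>();
    Dir(int files) { this.files = files; }
}

// Each subtree becomes a fork-join task; subtrees are processed in parallel
// and their counts are combined on join, mirroring the parallel quota walk.
class CountTask extends RecursiveTask<Long> {
    final Dir dir;
    CountTask(Dir dir) { this.dir = dir; }
    @Override protected Long compute() {
        List<CountTask> forked = new ArrayList<>();
        for (Dir child : dir.children) {
            CountTask t = new CountTask(child);
            t.fork();                 // schedule the subtree for parallel work
            forked.add(t);
        }
        long total = dir.files;
        for (CountTask t : forked) total += t.join();
        return total;
    }
}

public class QuotaInitSketch {
    public static void main(String[] args) {
        Dir root = new Dir(2);
        Dir a = new Dir(3);
        Dir b = new Dir(5);
        root.children.add(a);
        root.children.add(b);
        a.children.add(new Dir(7));
        long total = new ForkJoinPool().invoke(new CountTask(root));
        System.out.println(total); // prints: 17
    }
}
```

Note the single-thread fork-join row in the table above is slower than the existing scan (68s vs 55s): task creation and scheduling have overhead, and the win only appears once several threads work on disjoint subtrees.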
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720631#comment-14720631 ] Hadoop QA commented on HDFS-8983: -
(x) *{color:red}-1 overall{color}*
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 19m 53s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 7m 53s | The applied patch generated 1 additional warning messages. |
| {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 6s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 23m 2s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 161m 55s | Tests failed in hadoop-hdfs. |
| | | | 232m 10s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
| | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
| | hadoop.fs.permission.TestStickyBit |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752913/HDFS-8393.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / beb65c9 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/diffJavacWarnings.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/console |
This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
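The proposed rule, that a directory marked protected may only be deleted once it is empty, can be sketched as a small check (hypothetical helper names, not the actual FSDirectory/NameNode API):

```java
import java.util.*;

// Illustrative sketch of the HDFS-8983 rule: deletion of a protected
// directory is refused unless the directory has no children.
public class ProtectedDirsSketch {
    static boolean canDelete(String path, Set<String> protectedDirs,
                             Map<String, List<String>> tree) {
        List<String> children = tree.getOrDefault(path, Collections.<String>emptyList());
        // Unprotected paths are always deletable; protected ones only when empty.
        return !protectedDirs.contains(path) || children.isEmpty();
    }

    public static void main(String[] args) {
        Set<String> prot = new HashSet<>(Arrays.asList("/Users"));
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("/Users", Arrays.asList("/Users/alice"));
        System.out.println(canDelete("/Users", prot, tree)); // prints: false (protected, non-empty)
        System.out.println(canDelete("/tmp", prot, tree));   // prints: true (not protected)
        tree.put("/Users", Collections.<String>emptyList());
        System.out.println(canDelete("/Users", prot, tree)); // prints: true (now empty)
    }
}
```

The "unless empty" escape hatch is what distinguishes this from a blanket read-only flag: users can still clean out and remove a protected directory deliberately, but a recursive `rm` of a populated one is blocked.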