[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete

2015-09-25 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909087#comment-14909087
 ] 

Brandon Li commented on HDFS-9092:
--

+1. Patch looks good to me. Thank you [~yzhangal]

> Nfs silently drops overlapping write requests, thus data copying can't 
> complete
> ---
>
> Key: HDFS-9092
> URL: https://issues.apache.org/jira/browse/HDFS-9092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.1
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9092.001.patch
>
>
> When NOT using the 'sync' option, NFS writes may issue the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of the data copied via NFS stays at 1248752400.
> What happens is:
> 1. The write requests from the client are sent asynchronously. 
> 2. The NFS gateway has a handler that handles each incoming request by creating an 
> internal write request structure and putting it into a cache;
> 3. In parallel, a separate thread in the NFS gateway takes requests out of the 
> cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of an overlapping write request happens in step 2, but it only 
> checks the write request against the current offset, trimming the request if 
> necessary. Because the write requests are sent asynchronously, if two 
> requests are beyond the current offset and they overlap, the overlap is not 
> detected and both are put into the cache. This causes the symptom reported in 
> this case at step 3.
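
A minimal sketch of the detection gap, with hypothetical names (the real logic 
lives in OpenFileCtx): step 2 today only compares an incoming request against 
nextOffset, so two cached requests that both start beyond nextOffset can 
overlap each other undetected.

{code}
// Illustrative only; these are not the actual OpenFileCtx fields.
// Requires java.util.Map and java.util.SortedMap.
static boolean overlapsWrittenRegion(long offset, long nextOffset) {
  // This is the only check performed today in step 2.
  return offset < nextOffset;
}

static boolean overlapsPendingRequest(long offset, long count,
    SortedMap<Long, Long> pending /* start offset -> byte count */) {
  // The missing check: compare against the requests already in the cache.
  for (Map.Entry<Long, Long> e : pending.entrySet()) {
    long start = e.getKey();
    long end = start + e.getValue();
    if (offset < end && offset + count > start) {
      return true;
    }
  }
  return false;
}
{code}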



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909078#comment-14909078
 ] 

Hudson commented on HDFS-9132:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #419 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/419/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909080#comment-14909080
 ] 

Hudson commented on HDFS-9107:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #419 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/419/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.
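
One way to break this cycle, sketched with hypothetical names (an illustration 
of the idea, not necessarily the committed HeartbeatManager change): have the 
heartbeat monitor detect that it was itself stalled and skip dead-node marking 
for that round.

{code}
void heartbeatCheck() {
  // Time is org.apache.hadoop.util.Time; lastCheckTime, staleIntervalMs and
  // markDeadNodes() are hypothetical.
  long now = Time.monotonicNow();
  boolean monitorWasStalled = now - lastCheckTime > staleIntervalMs;
  lastCheckTime = now;
  if (monitorWasStalled) {
    // The monitor thread itself was paused (e.g. by a full GC) longer than
    // the stale interval, so datanodes never had a chance to heartbeat.
    // Skip marking nodes dead instead of bloating the replication queues.
    return;
  }
  markDeadNodes();
}
{code}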



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909079#comment-14909079
 ] 

Hudson commented on HDFS-9133:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #419 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/419/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete

2015-09-25 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909083#comment-14909083
 ] 

Brandon Li commented on HDFS-9092:
--

For category 2, the assumption of the fix is that the trimmed data is the same 
as what has already been written to HDFS.

So far we claim that the NFS gateway supports the use cases of file uploading and 
file streaming. 
For file uploading, the overlapped section is safe to drop since it will be the 
same as what has already been written to HDFS. The same holds for file streaming. 

The only possible problem is this: before the patch, if users did a random update 
to an HDFS file, the NFS gateway would report an error. With this patch, there is a 
chance we won't see the error if the updated section happens to be 
trimmed.

Since random writes are not supported anyway, this potentially more lenient 
reaction to a random update still seems acceptable to me.

> Nfs silently drops overlapping write requests, thus data copying can't 
> complete
> ---
>
> Key: HDFS-9092
> URL: https://issues.apache.org/jira/browse/HDFS-9092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.1
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9092.001.patch
>
>
> When NOT using the 'sync' option, NFS writes may issue the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of the data copied via NFS stays at 1248752400.
> What happens is:
> 1. The write requests from the client are sent asynchronously. 
> 2. The NFS gateway has a handler that handles each incoming request by creating an 
> internal write request structure and putting it into a cache;
> 3. In parallel, a separate thread in the NFS gateway takes requests out of the 
> cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of an overlapping write request happens in step 2, but it only 
> checks the write request against the current offset, trimming the request if 
> necessary. Because the write requests are sent asynchronously, if two 
> requests are beyond the current offset and they overlap, the overlap is not 
> detected and both are put into the cache. This causes the symptom reported in 
> this case at step 3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9044) Give Priority to FavouredNodes , before selecting nodes from FavouredNode's Node Group

2015-09-25 Thread J.Andreina (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908123#comment-14908123
 ] 

J.Andreina commented on HDFS-9044:
--

[~vinayrpet] and [~szetszwo] , can you please have a look at this jira and 
provide your feedback

> Give Priority to FavouredNodes , before selecting nodes from FavouredNode's 
> Node Group
> --
>
> Key: HDFS-9044
> URL: https://issues.apache.org/jira/browse/HDFS-9044
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: J.Andreina
>Assignee: J.Andreina
> Attachments: HDFS-9044.1.patch, HDFS-9044.2.patch, HDFS-9044.3.patch, 
> HDFS-9044.4.patch
>
>
> The intention of passing favored nodes is to place replicas on the favored nodes.
> The current behavior with node groups is: 
>   if a favored node is not available, a node from that favored node's 
> node group is chosen. 
> {noformat}
> Say for example:
>   1) I need 3 replicas and pass 5 favored nodes.
>   2) Out of the 5 favored nodes, 3 are not good.
>   3) Then, based on BlockPlacementPolicyWithNodeGroup, out of the 5 target nodes 
> returned, 3 will be random nodes from the 3 bad favored nodes' node groups. 
>   4) There is then a probability that all 3 of my replicas are placed on 
> random nodes from the favored nodes' node groups, instead of giving priority to 
> the 2 good favored nodes returned as targets.
> {noformat}
> *Instead of returning 5 targets at step 3 above, we can return the 2 good 
> favored nodes as targets,*
> *and the 1 remaining needed replica can be chosen from a random node of a bad 
> favored node's node group.*
> This will make sure that the favored nodes are given priority.
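
A hypothetical sketch of the proposed prioritization (isGoodTarget() and 
chooseRandomFromNodeGroup() are illustrative stand-ins, not the actual 
BlockPlacementPolicyWithNodeGroup methods):

{code}
List<DatanodeDescriptor> chooseTargets(int replication,
    List<DatanodeDescriptor> favoredNodes) {
  List<DatanodeDescriptor> targets = new ArrayList<>();
  // Pass 1: prefer the healthy favored nodes themselves.
  for (DatanodeDescriptor dn : favoredNodes) {
    if (targets.size() == replication) {
      return targets;
    }
    if (isGoodTarget(dn)) {
      targets.add(dn);
    }
  }
  // Pass 2: only then fall back to random nodes from the bad favored
  // nodes' node groups.
  for (DatanodeDescriptor dn : favoredNodes) {
    if (targets.size() == replication) {
      break;
    }
    DatanodeDescriptor fallback = chooseRandomFromNodeGroup(dn);
    if (fallback != null && !targets.contains(fallback)) {
      targets.add(fallback);
    }
  }
  return targets;
}
{code}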



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-25 Thread Casey Brotherton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908125#comment-14908125
 ] 

Casey Brotherton commented on HDFS-9100:


Hello,
The failed hdfs tests were related to hflush and the secondary namenode web UI, 
neither of which I changed.

Should I resubmit the patch to see if those test failures resolve?

Thanks,
Casey

> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the 
> dfs.client.use.datanode.hostname setting into consideration; this can cause a 
> balancer run issued from outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.
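
A sketch of the kind of fix implied above, assuming the 
DatanodeID#getXferAddr(boolean) overload that can return hostname:port instead 
of ip:port (exact API details may differ):

{code}
// Honor dfs.client.use.datanode.hostname when building the socket address.
boolean useHostname = conf.getBoolean(
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME,
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME_DEFAULT);
sock.connect(
    NetUtils.createSocketAddr(
        target.getDatanodeInfo().getXferAddr(useHostname)),
    HdfsConstants.READ_TIMEOUT);
{code}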



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy

2015-09-25 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-8647:
---
Attachment: HDFS-8647-004.patch

> Abstract BlockManager's rack policy into BlockPlacementPolicy
> -
>
> Key: HDFS-8647
> URL: https://issues.apache.org/jira/browse/HDFS-8647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, 
> HDFS-8647-003.patch, HDFS-8647-004.patch
>
>
> Sometimes we want the namenode to use an alternative block placement policy, 
> such as the upgrade domains in HDFS-7541.
> BlockManager has built-in assumptions about rack policy in functions such as 
> useDelHint and blockHasEnoughRacks. That means when we add a new block placement 
> policy, we need to modify BlockManager to account for the new policy. Ideally 
> BlockManager should ask the BlockPlacementPolicy object instead. That would allow 
> us to provide a new BlockPlacementPolicy without changing BlockManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6264) Provide FileSystem#create() variant which throws exception if parent directory doesn't exist

2015-09-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6264:
-
Attachment: hdfs-6264-v2.txt

> Provide FileSystem#create() variant which throws exception if parent 
> directory doesn't exist
> 
>
> Key: HDFS-6264
> URL: https://issues.apache.org/jira/browse/HDFS-6264
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>  Labels: hbase
> Attachments: hdfs-6264-v1.txt, hdfs-6264-v2.txt
>
>
> FileSystem#createNonRecursive() is deprecated.
> However, there is no DistributedFileSystem#create() implementation which 
> throws an exception if the parent directory doesn't exist.
> This limits clients' migration away from the deprecated method.
> For HBase, IO fencing relies on the behavior of 
> FileSystem#createNonRecursive().
> A variant of the create() method should be added which throws an exception if the 
> parent directory doesn't exist.
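
For illustration, the behavior callers currently get from the deprecated API, 
and which the requested create() variant would need to preserve (a sketch, 
assuming the boolean-overwrite overload of createNonRecursive()):

{code}
@SuppressWarnings("deprecation")
static FSDataOutputStream createFencingFile(FileSystem fs, Path p)
    throws IOException {
  // Throws an exception if the parent directory does not exist, which is
  // the behavior HBase's IO fencing depends on.
  return fs.createNonRecursive(p, false /* overwrite */, 4096,
      fs.getDefaultReplication(p), fs.getDefaultBlockSize(p), null);
}
{code}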



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot

2015-09-25 Thread Ajith S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated HDFS-4167:
--
Attachment: HDFS-4167.05.patch

Updated to trunk and added support for the CLI. Please review.

> Add support for restoring/rolling back to a snapshot
> 
>
> Key: HDFS-4167
> URL: https://issues.apache.org/jira/browse/HDFS-4167
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Ajith S
> Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, 
> HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, 
> HDFS-4167.05.patch
>
>
> This jira tracks work related to restoring a directory/file to a snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy

2015-09-25 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908271#comment-14908271
 ] 

Brahma Reddy Battula commented on HDFS-8647:


[~mingma] Uploaded the patch. Kindly review.

> Abstract BlockManager's rack policy into BlockPlacementPolicy
> -
>
> Key: HDFS-8647
> URL: https://issues.apache.org/jira/browse/HDFS-8647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, 
> HDFS-8647-003.patch, HDFS-8647-004.patch
>
>
> Sometimes we want the namenode to use an alternative block placement policy, 
> such as the upgrade domains in HDFS-7541.
> BlockManager has built-in assumptions about rack policy in functions such as 
> useDelHint and blockHasEnoughRacks. That means when we add a new block placement 
> policy, we need to modify BlockManager to account for the new policy. Ideally 
> BlockManager should ask the BlockPlacementPolicy object instead. That would allow 
> us to provide a new BlockPlacementPolicy without changing BlockManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7899) Improve EOF error message

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908153#comment-14908153
 ] 

Hadoop QA commented on HDFS-7899:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 36s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  4s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 29s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 55s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | |  45m  1s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762388/HDFS-7899-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e65c5 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12677/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12677/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12677/console |


This message was automatically generated.

> Improve EOF error message
> -
>
> Key: HDFS-7899
> URL: https://issues.apache.org/jira/browse/HDFS-7899
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Jagadesh Kiran N
>Priority: Minor
> Attachments: HDFS-7899-00.patch, HDFS-7899-01.patch
>
>
> Currently, a DN disconnection for reasons other than a connection timeout or 
> a connection-refused message, such as an EOF as a result of rejection or another 
> network fault, is reported in this manner:
> {code}
> WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /x.x.x.x: for 
> block, add to deadNodes and continue. java.io.EOFException: Premature EOF: no 
> length prefix available 
> java.io.EOFException: Premature EOF: no length prefix available 
> at 
> org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>  
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:392)
>  
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:137)
>  
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1103)
>  
> at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:538) 
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:750)
>  
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:794) 
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:602) 
> {code}
> This is not very clear to a user (it warns at the hdfs-client). It could likely 
> be improved with a more diagnosable message, or at least the direct reason 
> rather than a bare EOF.
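
A hypothetical sketch of the sort of improvement being asked for: wrap the bare 
EOFException with the peer address and a hint at likely causes ('dnAddr' and 
the surrounding context are illustrative):

{code}
try {
  return HdfsProtoUtil.vintPrefixed(in);
} catch (EOFException e) {
  throw new EOFException("Premature EOF from DataNode " + dnAddr
      + " while reading the block op response; the DataNode may have"
      + " closed the connection (overloaded, shutting down, or rejecting"
      + " the request). Original message: " + e.getMessage());
}
{code}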



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot

2015-09-25 Thread Ajith S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated HDFS-4167:
--
Status: Patch Available  (was: Open)

> Add support for restoring/rolling back to a snapshot
> 
>
> Key: HDFS-4167
> URL: https://issues.apache.org/jira/browse/HDFS-4167
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Ajith S
> Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, 
> HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, 
> HDFS-4167.05.patch
>
>
> This jira tracks work related to restoring a directory/file to a snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-09-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908128#comment-14908128
 ] 

Chen He commented on HDFS-8673:
---

That is true if the JVM crashes, [~andreina]. 

There are two cases in which a file ends with "._COPYING_":
1. it was named "file._COPYING_" to begin with;
2. it is a leftover file from a failure (or failures).

How about we change the warning message to add something like: "please remove 
'file1._COPYING_' first". 

> HDFS reports file already exists if there is a file/dir name end with 
> ._COPYING_
> 
>
> Key: HDFS-8673
> URL: https://issues.apache.org/jira/browse/HDFS-8673
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.0
>Reporter: Chen He
>Assignee: Chen He
> Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
> HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
> HDFS-8673.003.patch
>
>
> The CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
> the tail of the file name when it does the copy. This causes a problem if there 
> is a file/dir already called *._COPYING_ on HDFS.
> For file:
> -bash-4.1$ hadoop fs -put 5M /user/occ/
> -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
> For dir:
> -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> put: /user/occ/5M._COPYING_ already exists as a directory
> -bash-4.1$ hadoop fs -ls /user/occ/
> (/user/occ/5M._COPYING_ is gone)
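
For illustration, the mechanism behind the collision (a sketch, not the exact 
CommandWithDestination code; copyStreamToTarget() here is a hypothetical 
helper): the shell writes to a temporary "._COPYING_" path and renames it into 
place on success.

{code}
Path dst = new Path("/user/occ/5M");
Path tmp = dst.suffix("._COPYING_");  // /user/occ/5M._COPYING_
// The copy streams into the temporary path first...
copyStreamToTarget(in, tmp);
// ...then renames it into place, which collides with any pre-existing
// /user/occ/5M._COPYING_ file or directory.
fs.rename(tmp, dst);
{code}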



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9144) Refactor libhdfs into stateful/ephemeral objects

2015-09-25 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-9144:


 Summary: Refactor libhdfs into stateful/ephemeral objects
 Key: HDFS-9144
 URL: https://issues.apache.org/jira/browse/HDFS-9144
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-8707
Reporter: Bob Hansen


In discussion for other efforts, we decided that we should separate several 
concerns:

* A posix-like FileSystem/FileHandle object (stream-based, positional reads)
* An ephemeral ReadOperation object that holds the state for reads-in-progress, 
which consumes
* An immutable FileInfo object which holds the block map and file size (and 
other metadata about the file that we assume will not change over the life of 
the file)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7663) Erasure Coding: lease recovery / append on striped file

2015-09-25 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-7663:

Attachment: HDFS-7663.00.txt

> Erasure Coding: lease recovery / append on striped file
> ---
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2015-09-25 Thread Constantine Peresypkin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908412#comment-14908412
 ] 

Constantine Peresypkin commented on HDFS-3107:
--

OK, great success after patching the write cache.
Where do I post the code?

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.0
>
> Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
> HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
> HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, which makes upper-layer applications use ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-6674) UserGroupInformation.loginUserFromKeytab will hang forever if keytab file length is less than 6 bytes.

2015-09-25 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-6674.
---
Resolution: Invalid

The hang, if still valid, appears to be the outcome of a fault in the underlying 
Java libraries. There is nothing HDFS can control about this, and 
this bug instead needs to be reported to the Oracle/OpenJDK communities with a 
test case.

> UserGroupInformation.loginUserFromKeytab will hang forever if keytab file 
> length is less than 6 bytes.
> --
>
> Key: HDFS-6674
> URL: https://issues.apache.org/jira/browse/HDFS-6674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.1-alpha
>Reporter: liuyang
>Priority: Minor
>
> The jstack is as follows:
>java.lang.Thread.State: RUNNABLE
>   at java.io.FileInputStream.available(Native Method)
>   at java.io.BufferedInputStream.available(BufferedInputStream.java:399)
>   - locked <0x000745585330> (a 
> sun.security.krb5.internal.ktab.KeyTabInputStream)
>   at sun.security.krb5.internal.ktab.KeyTab.load(KeyTab.java:257)
>   at sun.security.krb5.internal.ktab.KeyTab.<init>(KeyTab.java:97)
>   at sun.security.krb5.internal.ktab.KeyTab.getInstance0(KeyTab.java:124)
>   - locked <0x000745586560> (a java.lang.Class for 
> sun.security.krb5.internal.ktab.KeyTab)
>   at sun.security.krb5.internal.ktab.KeyTab.getInstance(KeyTab.java:157)
>   at javax.security.auth.kerberos.KeyTab.takeSnapshot(KeyTab.java:119)
>   at 
> javax.security.auth.kerberos.KeyTab.getEncryptionKeys(KeyTab.java:192)
>   at 
> javax.security.auth.kerberos.JavaxSecurityAuthKerberosAccessImpl.keyTabGetEncryptionKeys(JavaxSecurityAuthKerberosAccessImpl.java:36)
>   at 
> sun.security.jgss.krb5.Krb5Util.keysFromJavaxKeyTab(Krb5Util.java:381)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:701)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at javax.security.auth.login.LoginContext.invoke(LoginContext.java:784)
>   at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
>   at javax.security.auth.login.LoginContext$5.run(LoginContext.java:721)
>   at javax.security.auth.login.LoginContext$5.run(LoginContext.java:719)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:718)
>   at javax.security.auth.login.LoginContext.login(LoginContext.java:590)
>   at 
> org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:679)
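
A hedged sketch of a caller-side guard (not an HDFS fix, per the resolution 
above): reject obviously truncated keytabs before handing them to the JDK code 
where the hang occurs.

{code}
File keytab = new File(keytabPath);
if (!keytab.isFile() || keytab.length() < 6) {
  throw new IOException("Keytab " + keytabPath + " is missing or truncated ("
      + keytab.length() + " bytes); refusing to attempt login");
}
UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
{code}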



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8053:

Attachment: HDFS-8053.004.patch

Thanks [~wheat9]. The v4 patch addresses the latest comments.

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch, HDFS-8053.003.patch, HDFS-8053.004.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputStream}} classes from {{hadoop-hdfs}} to the {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it together.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9112:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~anu] for the 
contribution! Also thanks for the review [~templedf]!

> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses the haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908695#comment-14908695
 ] 

Hadoop QA commented on HDFS-9139:
-

(!) A patch to the files used for the QA process has been detected. 
Re-executing against the patched versions to perform further tests. 
The console is at 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/console in case of 
problems.

> Enable parallel JUnit tests for HDFS Pre-commit 
> 
>
> Key: HDFS-9139
> URL: https://issues.apache.org/jira/browse/HDFS-9139
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9139.01.patch
>
>
> Forked from HADOOP-11984. 
> With the initial and significant work from [~cnauroth], this jira is to track 
> and support parallel test runs for HDFS pre-commit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908708#comment-14908708
 ] 

Colin Patrick McCabe commented on HDFS-9132:


The ReplicaAccessor may be used to access either finalized or non-finalized 
replicas.  The genstamp may not matter for some implementations, but others may 
want to check it and fall back to a different access method if it doesn't match 
the most current genstamp.
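
A hypothetical sketch of that fallback behavior (lookupReplica(), MyReplicaInfo 
and CustomReplicaAccessor are illustrative; this assumes, as the builder 
contract suggests, that returning null makes the client fall back to another 
access method):

{code}
public ReplicaAccessor build() {
  MyReplicaInfo replica = lookupReplica(blockId);
  if (replica == null || replica.getGenerationStamp() != genstamp) {
    // Stale or missing replica: decline, so the client falls back to
    // the ordinary remote block reader.
    return null;
  }
  return new CustomReplicaAccessor(replica);
}
{code}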

> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908720#comment-14908720
 ] 

Colin Patrick McCabe commented on HDFS-9133:


bq. Thanks Colin for the work. The patch itself looks good, +1.

Thanks

bq. For other block readers, the length means the number of bytes to read, but 
in ExternalBlockReader, it means the "visible length", which assumes it's the block 
replica length? The calculation of skip and available in 
ExternalBlockReader indicates it assumes it's the block replica length.

You are correct... we should not be treating this as the {{visibleLength}} of 
the block, since it's really the number of bytes between the current offset 
within the block and the end of the block.  Let me file another jira to fix 
this.
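
A sketch combining the two points in this thread (illustrative, not the 
committed patch): return -1 at EOF per BlockReader's contract, and treat the 
length passed in as bytes remaining rather than the replica's visible length.

{code}
public int read(byte[] buf, int off, int len) throws IOException {
  if (remaining <= 0) {
    return -1;  // at EOF: -1, never 0, per BlockReader.java's JavaDoc
  }
  int toRead = (int) Math.min(len, remaining);
  int n = accessor.read(pos, buf, off, toRead);
  if (n > 0) {
    pos += n;        // current offset within the block
    remaining -= n;  // bytes left to read, *not* the visibleLength
  }
  return n;
}
{code}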

> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908727#comment-14908727
 ] 

Colin Patrick McCabe commented on HDFS-9133:


I filed HDFS-9147 to fix the naming and documentation of the 
ExternalBlockReader length field.

> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9112) Haadmin fails if multiple name service IDs are configured

2015-09-25 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9112:
---
Attachment: HDFS-9112.004.patch

[~templedf] Thanks for your comments, I have taken care of both issues pointed 
out by you.


> Haadmin fails if multiple name service IDs are configured
> -
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses the haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit

2015-09-25 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908595#comment-14908595
 ] 

Vinayakumar B edited comment on HDFS-9139 at 9/25/15 8:43 PM:
--

Adding the first patch.
Since HADOOP-11984 is not yet in, this includes those changes as well.


was (Author: vinayrpet):
adding first patch.
Since HADOOP-11984, is not yet in, this icludes those changes as well ?

> Enable parallel JUnit tests for HDFS Pre-commit 
> 
>
> Key: HDFS-9139
> URL: https://issues.apache.org/jira/browse/HDFS-9139
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9139.01.patch
>
>
> Forked from HADOOP-11984. 
> With the initial and significant work from [~cnauroth], this jira is to track 
> and support parallel test runs for HDFS pre-commit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9112:

Issue Type: Improvement  (was: Bug)

> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses the haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9112:

Summary: Improve error message for Haadmin when multiple name service IDs 
are configured  (was: Haadmin fails if multiple name service IDs are configured)

> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses the haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7766) Add a flag to WebHDFS op=CREATE to not respond with a 307 redirect

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908700#comment-14908700
 ] 

Hadoop QA commented on HDFS-7766:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  23m 56s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   7m 57s | The applied patch generated  4  
additional warning messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m  3s | Site still builds. |
| {color:red}-1{color} | checkstyle |   2m 28s | The applied patch generated  1 
new checkstyle issues (total was 0, now 1). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 169m 11s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 31s | Tests passed in 
hadoop-hdfs-client. |
| | | 227m 30s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762415/HDFS-7766.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 83e65c5 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12681/console |


This message was automatically generated.

> Add a flag to WebHDFS op=CREATE to not respond with a 307 redirect
> --
>
> Key: HDFS-7766
> URL: https://issues.apache.org/jira/browse/HDFS-7766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7766.01.patch, HDFS-7766.02.patch, 
> HDFS-7766.03.patch
>
>
> Please see 
> https://issues.apache.org/jira/browse/HDFS-7588?focusedCommentId=14276192=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14276192
> A backwards-compatible way we can fix this is to add a flag on the request 
> which would disable the redirect, i.e.
> {noformat}
> curl -i -X PUT 
> "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE[&<FLAG>=<VALUE>]"
> {noformat}
> which returns 200 with the DN location in the response.
> This would allow browser clients to get the redirect URL to put the file 
> to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2015-09-25 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9145:

Description: It will be helpful if we can have a way to track (or at 
least log a msg) when some operation is holding the FSNamesystem lock for a long 
time.  (was: It will be helpful that if we can have a way to track (or at least 
log a msg) if some operation is hold the FSNamesystem lock for a long time.)

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
>
> It will be helpful if we can have a way to track (or at least log a msg) 
> when some operation is holding the FSNamesystem lock for a long time.
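
A minimal sketch of the idea, with hypothetical names and threshold (Time is 
org.apache.hadoop.util.Time):

{code}
private volatile long writeLockHeldSince;

public void writeLock() {
  fsLock.writeLock().lock();
  writeLockHeldSince = Time.monotonicNow();
}

public void writeUnlock() {
  long heldMs = Time.monotonicNow() - writeLockHeldSince;
  fsLock.writeLock().unlock();
  if (heldMs > LOCK_WARN_THRESHOLD_MS) {
    LOG.warn("FSNamesystem write lock was held for " + heldMs + " ms by "
        + Thread.currentThread().getName());
  }
}
{code}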



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-25 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908715#comment-14908715
 ] 

Bob Hansen commented on HDFS-8855:
--

Given that this is in response to an HTTP request, there's already plenty of 
string manipulation and network I/O going on.  It will be a long time until the 
string construction is our bottleneck.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.005.patch, HDFS-8855.1.patch, 
> HDFS-8855.2.patch, HDFS-8855.3.patch, HDFS-8855.4.patch, 
> HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Haadmin fails if multiple name service IDs are configured

2015-09-25 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908656#comment-14908656
 ] 

Daniel Templeton commented on HDFS-9112:


Thanks, [~anu].  +1 (non-binding)

(That last newline isn't really needed.  Newlines between braces don't help 
much.)

> Haadmin fails if multiple name service IDs are configured
> -
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than 1 name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908707#comment-14908707
 ] 

Colin Patrick McCabe commented on HDFS-8873:


Another repeat of the Jenkins "NoClassDefFound" glitch.  Re-triggering the 
build. +1 pending Jenkins.

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch, HDFS-8873.004.patch, HDFS-8873.005.patch, 
> HDFS-8873.006.patch, HDFS-8873.007.patch, HDFS-8873.008.patch, 
> HDFS-8873.009.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for a full directory listing, which translates to 
> 655 seconds). 
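
An illustrative duty-cycle throttle (hypothetical names; ScanUnit stands in for 
one bounded chunk of scanning work): scan for at most runWindowMs out of every 
periodMs and sleep for the remainder, so the scanner cannot keep the disks 100% 
busy.

{code}
void throttledScan(Iterable<ScanUnit> workQueue) throws InterruptedException {
  final long periodMs = 1000;
  final long runWindowMs = 300;  // ~30% duty cycle
  long windowStart = Time.monotonicNow();
  for (ScanUnit unit : workQueue) {
    unit.scan();  // one bounded chunk of directory-scanning work
    if (Time.monotonicNow() - windowStart >= runWindowMs) {
      Thread.sleep(periodMs - runWindowMs);  // yield the disks
      windowStart = Time.monotonicNow();
    }
  }
}
{code}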



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9147) ExternalBlockReader should not treat bytesRemaining as visibleLength

2015-09-25 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-9147:
--

 Summary: ExternalBlockReader should not treat bytesRemaining as 
visibleLength
 Key: HDFS-9147
 URL: https://issues.apache.org/jira/browse/HDFS-9147
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


ExternalBlockReader should not treat the bytesRemaining passed in from 
DFSInputStream as the visibleLength.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Haadmin fails if multiple name service IDs are configured

2015-09-25 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908307#comment-14908307
 ] 

Daniel Templeton commented on HDFS-9112:


Thanks, [~anu].  Good error message!  If you don't mind, I'd like to suggest 
some minor changes:

bq. "Unable to determine the nameservice id. This is an HA configuration with 
multiple Name services configured. " + DFS_NAMESERVICES + " value is set to " + 
Arrays.toString(dfsNames) + ". Please re-run with -ns option."

I would prefer:

{code}
"Unable to determine the name service ID. This is an HA configuration with 
multiple name services configured. " + DFS_NAMESERVICES + " is set to " + 
Arrays.toString(dfsNames) + ". Please re-run with the -ns option."
{code}

Sorry to play language police. :)

I would also rather see some use of newlines to make the code a little more 
readable.  I like to see a newline on either side of a code block, so, in this 
case, before each of the if statements and after the closing braces.  I'll +1 
the patch (after the language changes) without newlines, but I think it 
would be swell to add some.

> Haadmin fails if multiple name service IDs are configured
> -
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses the haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908498#comment-14908498
 ] 

Haohui Mai commented on HDFS-8053:
--

{code}
+
+public class HdfsConfigurationLoader {
+
+  /** Load the default resources.
+   *
+   * This method is idempotent as the {@link Configuration#addDefaultResource}
+   * will only add the default resources once, in the order of their
+   * appearance.
+   */
+  public static void load() {
+    // adds the default resources
+    Configuration.addDefaultResource("hdfs-default.xml");
+    Configuration.addDefaultResource("hdfs-site.xml");
+  }
+}
{code}

(1) The class should be package-local. (2) This is inefficient and does not 
guard against load() being called multiple times. Please follow the paradigm in 
{{HdfsConfiguration}}.
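
A minimal sketch of the guarded, package-local shape meant here (names are illustrative, not from the actual patch):
{code}
// Sketch only: a package-local loader that guards against repeated work
// itself, instead of relying on Configuration.addDefaultResource to
// deduplicate. Mirrors the static-initialization paradigm in HdfsConfiguration.
import org.apache.hadoop.conf.Configuration;

class HdfsConfigurationLoader {
  private static boolean loaded = false;

  private HdfsConfigurationLoader() {}

  static synchronized void init() {
    if (loaded) {
      return;                       // already registered; do nothing
    }
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
    loaded = true;
  }
}
{code}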

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch, HDFS-8053.003.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputSream}} classes from {{hadoop-hdfs}} to {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it together.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7663) Erasure Coding: lease recovery / append on striped file

2015-09-25 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908505#comment-14908505
 ] 

Jing Zhao edited comment on HDFS-7663 at 9/25/15 6:54 PM:
--

Thanks for sharing the thoughts, Walter! First I completely agree that we 
should start from #3 you proposed. More specifically, quote from your doc:
{quote}
If lastBlock is complete, start a new blockGroup.
If lastBlock is UC, start a blockRecovery. After recovery, start a new 
blockGroup. BlockRecovery prefers to save as much data from the last partial 
stripe as possible.

pros: simple and clear code, with HDFS-3689.
cons: a) Has partial blockGroup, wastes NN memory. b) Could hurt read 
performance.
{quote}
#3 is something we should start with. 

Some high level thinking:
# Lease/Block recovery is actually a different topic and is a tricky problem. 
As you suggested in HDFS-9040, we can create a separate jira for it, and the 
second half of your doc can be our first proposal there.
# It looks like we do not yet have a correct solution for EC file lease 
recovery (at least not a protocol for DataNodes to do the recovery work). We 
should first let NameNode *NOT* send recovery commands to DataNodes for under 
construction/recovery EC blocks.
# Appending to a partial block group is actually very similar to "keep writing 
after calling hsync/hflush", since we need to overwrite the parity data of the 
last stripe. We have an initial proposal in the latest design doc in HDFS-7285 
about this part.


was (Author: jingzhao):
Thanks for sharing the thoughts, Walter! First I completely agree that we 
should start from #3 you proposed. More specifically, quote from your doc:
{quote}
If lastBlock is complete, start a new blockGroup.
If lastBlock is UC, start a blockRecovery. After recovery, start a new 
blockGroup. BlockRecovery prefers to save as much data from the last partial 
stripe as possible.

pros: simple and clear code, with HDFS-3689.
cons: a) Has partial blockGroup, wastes NN memory. b) Could hurt read 
performance.
{quote}
This is something we can start with. 

Some high level thinking:
# Lease/Block recovery is actually a different topic and is a tricky problem. 
As you suggested in HDFS-9040, we can create a separate jira for it, and the 
second half of your doc can be our first proposal there.
# It looks like we do not yet have a correct solution for EC file lease 
recovery (at least not a protocol for DataNodes to do the recovery work). We 
should first let NameNode *NOT* send recovery commands to DataNodes for under 
construction/recovery EC blocks.
# Appending to a partial block group is actually very similar to "keep writing 
after calling hsync/hflush", since we need to overwrite the parity data of the 
last stripe. We have an initial proposal in the latest design doc in HDFS-7285 
about this part.

> Erasure Coding: lease recovery / append on striped file
> ---
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2015-09-25 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908517#comment-14908517
 ] 

Jitendra Nath Pandey commented on HDFS-9145:


Thanks for filing this [~jingzhao]. It will be super useful.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
>
> It will be helpful if we have a way to track (or at least log a message) 
> when some operation holds the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9142) Namenode Http address is not configured correctly for federated cluster in MiniDFSCluster

2015-09-25 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-9142:
--
Attachment: HDFS-9142.v3.patch

> Namenode Http address is not configured correctly for federated cluster in 
> MiniDFSCluster
> -
>
> Key: HDFS-9142
> URL: https://issues.apache.org/jira/browse/HDFS-9142
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: HDFS-9142.v1.patch, HDFS-9142.v2.patch, 
> HDFS-9142.v3.patch
>
>
> When setting up simpleHAFederatedTopology in MiniDFSCluster, each Namenode 
> should have its own configuration object, and the configuration should have 
> "dfs.namenode.http-address--" set up correctly for 
> all  pair



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908540#comment-14908540
 ] 

Owen O'Malley commented on HDFS-8855:
-

This is ok. +1

I'm a little concerned about the runtime performance of generating the string 
of the identifier on every connection to the datanode, but this should be 
correct.
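
If it does show up in profiles, one mitigation (purely a sketch, not part of this patch; names are invented) would be to memoize the rendered identifier per token:
{code}
// Hypothetical sketch: render each token identifier string once and reuse it
// across DataNode connections. Class and method names are illustrative.
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.security.token.Token;

class TokenStringCache {
  private final ConcurrentHashMap<Token<?>, String> cache =
      new ConcurrentHashMap<Token<?>, String>();

  String render(Token<?> token) {
    String s = cache.get(token);
    if (s == null) {
      s = token.toString();           // the expensive rendering happens once
      cache.putIfAbsent(token, s);
    }
    return s;
  }
}
{code}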

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.005.patch, HDFS-8855.1.patch, 
> HDFS-8855.2.patch, HDFS-8855.3.patch, HDFS-8855.4.patch, 
> HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9146) HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader

2015-09-25 Thread Gopal V (JIRA)
Gopal V created HDFS-9146:
-

 Summary: HDFS forward seek() within a block shouldn't spawn new 
TCP Peer/RemoteBlockReader
 Key: HDFS-9146
 URL: https://issues.apache.org/jira/browse/HDFS-9146
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1, 2.6.0, 2.8.0
Reporter: Gopal V


When a seek() + forward readFully() is triggered from a remote dfsclient, HDFS 
opens a new remote block reader even if the seek is within the same HDFS block.

(analysis from [~rajesh.balamohan])

This is because a simple read operation assumes that the user is going to read 
to the end of the block.

{code}
  try {
blockReader = getBlockReader(targetBlock, offsetIntoBlock,
targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
storageType, chosenNode);
{code}

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L624

Since the user hasn't read to the end of the block when the next seek happens, 
the BlockReader treats this as an aborted read and throws away the TCP peer it 
holds.

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java#L324

{code}
// If we've now satisfied the whole client read, read one last packet
// header, which should be empty
if (bytesNeededToFinish <= 0) {
  readTrailingEmptyPacket(); 
 ...
  sendReadResult(Status.SUCCESS);
{code}

Since that is not satisfied, the status code is unset & the peer is not 
returned to the cache.

{code}
if (peerCache != null && sentStatusCode) {
  peerCache.put(datanodeID, peer);
} else {
  peer.close();
}
{code}
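
One direction a fix could take (a sketch only, under the assumption that BlockReader#skip can drain the gap; the threshold and the surrounding fields are invented):
{code}
// Sketch: serve a short forward seek within the current block by skipping
// bytes on the existing reader instead of discarding the TCP peer. The
// SKIP_THRESHOLD value and the pos/blockReader fields are illustrative.
private static final long SKIP_THRESHOLD = 128 * 1024;  // bytes

private boolean trySkipForward(long targetPos) throws IOException {
  long gap = targetPos - pos;
  if (blockReader == null || gap < 0 || gap > SKIP_THRESHOLD) {
    return false;                 // fall back to opening a new block reader
  }
  long remaining = gap;
  while (remaining > 0) {
    long skipped = blockReader.skip(remaining);
    if (skipped <= 0) {
      return false;               // cannot skip further; reopen instead
    }
    remaining -= skipped;
  }
  pos = targetPos;
  return true;
}
{code}
Whether the end-of-read status handshake can still complete on that path is exactly the open question here.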



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908590#comment-14908590
 ] 

Hadoop QA commented on HDFS-8647:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 10s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 26s | The applied patch generated  2 
new checkstyle issues (total was 332, now 324). |
| {color:red}-1{color} | whitespace |   0m  2s | The patch has 7  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 36s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 18s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 168m 28s | Tests failed in hadoop-hdfs. |
| | | 214m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.fs.viewfs.TestViewFileSystemHdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762402/HDFS-8647-004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e65c5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12680/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12680/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12680/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12680/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12680/console |


This message was automatically generated.

> Abstract BlockManager's rack policy into BlockPlacementPolicy
> -
>
> Key: HDFS-8647
> URL: https://issues.apache.org/jira/browse/HDFS-8647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, 
> HDFS-8647-003.patch, HDFS-8647-004.patch
>
>
> Sometimes we want to have the namenode use an alternative block placement 
> policy, such as upgrade domains in HDFS-7541.
> BlockManager has built-in assumptions about rack policy in functions such as 
> useDelHint and blockHasEnoughRacks. That means when we add a new block 
> placement policy, we need to modify BlockManager to account for it. Ideally 
> BlockManager should ask the BlockPlacementPolicy object instead. That will 
> allow us to provide a new BlockPlacementPolicy without changing BlockManager.
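
The shape of the abstraction being proposed, roughly (signatures are illustrative, not the committed API):
{code}
// Sketch: BlockManager delegates the rack-sensitive questions to the policy
// object instead of hard-coding them. Method names/signatures are illustrative.
import java.util.Collection;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo;

public abstract class BlockPlacementPolicySketch {
  /** Does this set of replica locations satisfy the placement policy? */
  public abstract boolean isPlacementPolicySatisfied(
      Collection<DatanodeStorageInfo> locations, int requiredReplication);

  /** May this delete hint be honored without violating the policy? */
  public abstract boolean useDelHint(DatanodeStorageInfo hint,
      Collection<DatanodeStorageInfo> locations);
}
{code}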



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit

2015-09-25 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9139:

Attachment: HDFS-9139.01.patch

Adding the first patch.
Since HADOOP-11984 is not yet in, this includes those changes as well.

> Enable parallel JUnit tests for HDFS Pre-commit 
> 
>
> Key: HDFS-9139
> URL: https://issues.apache.org/jira/browse/HDFS-9139
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9139.01.patch
>
>
> Forked from HADOOP-11984.
> With the initial and significant work from [~cnauroth], this Jira is to track 
> and support parallel test runs for HDFS pre-commit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit

2015-09-25 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9139:

Status: Patch Available  (was: Open)

> Enable parallel JUnit tests for HDFS Pre-commit 
> 
>
> Key: HDFS-9139
> URL: https://issues.apache.org/jira/browse/HDFS-9139
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9139.01.patch
>
>
> Forked from HADOOP-11984.
> With the initial and significant work from [~cnauroth], this Jira is to track 
> and support parallel test runs for HDFS pre-commit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7663) Erasure Coding: lease recovery / append on striped file

2015-09-25 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908505#comment-14908505
 ] 

Jing Zhao edited comment on HDFS-7663 at 9/25/15 6:53 PM:
--

Thanks for sharing the thoughts, Walter! First I completely agree that we 
should start from #3 you proposed. More specifically, quote from your doc:
{quote}
If lastBlock is complete, start a new blockGroup.
If lastBlock is UC, start a blockRecovery. After recovery, start a new 
blockGroup. BlockRecovery prefers to save as much data from the last partial 
stripe as possible.

pros: simple and clear code, with HDFS-3689.
cons: a) Has partial blockGroup, wastes NN memory. b) Could hurt read 
performance.
{quote}
This is something we can start with. 

Some high level thinking:
# Lease/Block recovery is actually a different topic and is a tricky problem. 
As you suggested in HDFS-9040, we can create a separate jira for it, and the 
second half of your doc can be our first proposal there.
# It looks like we do not yet have a correct solution for EC file lease 
recovery (at least not a protocol for DataNodes to do the recovery work). We 
should first let NameNode *NOT* send recovery commands to DataNodes for under 
construction/recovery EC blocks.
# Appending to a partial block group is actually very similar to "keep writing 
after calling hsync/hflush", since we need to overwrite the parity data of the 
last stripe. We have an initial proposal in the latest design doc in HDFS-7285 
about this part.


was (Author: jingzhao):
Thanks for sharing the thoughts, Walter! First I completely agree that we 
should start from #3 you proposed. More specifically, quote from your doc:
{quote}
If lastBlock is complete, start a new blockGroup.
If lastBlock is UC, start a blockRecovery. After recovery, start a new 
blockGroup. BlockRecovery prefers to save as much data from the last partial 
stripe as possible.

pros: simple and clear code, with HDFS-3689.
cons: a) Has partial blockGroup, wastes NN memory. b) Could hurt read 
performance.
{quote}
This is something we can easily support and also we should start with. 

Some high level thinking:
# Lease/Block recovery is actually a different topic and is a tricky problem. 
As you suggested in HDFS-9040, we can create a separate jira for it, and the 
second half of your doc can be our first proposal there.
# It looks like we do not yet have a correct solution for EC file lease 
recovery (at least not a protocol for DataNodes to do the recovery work). We 
should first let NameNode *NOT* send recovery commands to DataNodes for under 
construction/recovery EC blocks.
# Appending to a partial block group is actually very similar to "keep writing 
after calling hsync/hflush", since we need to overwrite the parity data of the 
last stripe. We have an initial proposal in the latest design doc in HDFS-7285 
about this part.

> Erasure Coding: lease recovery / append on striped file
> ---
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7663) Erasure Coding: lease recovery / append on striped file

2015-09-25 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908505#comment-14908505
 ] 

Jing Zhao commented on HDFS-7663:
-

Thanks for sharing the thoughts, Walter! First I completely agree that we 
should start from #3 you proposed. More specifically, quote from your doc:
{quote}
If lastBlock is complete, start a new blockGroup.
If lastBlock is UC, start a blockRecovery. After recovery, start a new 
blockGroup. BlockRecovery prefers to save as much data from the last partial 
stripe as possible.

pros: simple and clear code, with HDFS-3689.
cons: a) Has partial blockGroup, wastes NN memory. b) Could hurt read 
performance.
{quote}
This is something we can easily support and also we should start with. 

Some high level thinking:
# Lease/Block recovery is actually a different topic and is a tricky problem. 
As you suggested in HDFS-9040, we can create a separate jira for it, and the 
second half of your doc can be our first proposal there.
# It looks like we do not yet have a correct solution for EC file lease 
recovery (at least not a protocol for DataNodes to do the recovery work). We 
should first let NameNode *NOT* send recovery commands to DataNodes for under 
construction/recovery EC blocks.
# Appending to a partial block group is actually very similar to "keep writing 
after calling hsync/hflush", since we need to overwrite the parity data of the 
last stripe. We have an initial proposal in the latest design doc in HDFS-7285 
about this part.

> Erasure Coding: lease recovery / append on striped file
> ---
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot

2015-09-25 Thread Ajith S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated HDFS-4167:
--
Status: Open  (was: Patch Available)

> Add support for restoring/rolling back to a snapshot
> 
>
> Key: HDFS-4167
> URL: https://issues.apache.org/jira/browse/HDFS-4167
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Ajith S
> Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, 
> HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, 
> HDFS-4167.05.patch
>
>
> This jira tracks work related to restoring a directory/file to a snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8053:

Attachment: HDFS-8053.002.patch

The v2 patch fixes the javac, whitespace and release audit warnings. Some 
existing warnings are not addressed as this jira focuses on the code move.

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputSream}} classes from {{hadoop-hdfs}} to {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it together.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7899) Improve EOF error message

2015-09-25 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908416#comment-14908416
 ] 

Jagadesh Kiran N commented on HDFS-7899:


As the only change is in the message, no test cases are involved. 
[~vinayrpet], [~qwertymaniac], please review.

> Improve EOF error message
> -
>
> Key: HDFS-7899
> URL: https://issues.apache.org/jira/browse/HDFS-7899
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Jagadesh Kiran N
>Priority: Minor
> Attachments: HDFS-7899-00.patch, HDFS-7899-01.patch
>
>
> Currently, a DN disconnection for reasons other than connection timeout or 
> refused messages, such as an EOF message as a result of rejection or other 
> network fault, reports in this manner:
> {code}
> WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /x.x.x.x: for 
> block, add to deadNodes and continue. java.io.EOFException: Premature EOF: no 
> length prefix available 
> java.io.EOFException: Premature EOF: no length prefix available 
> at 
> org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>  
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:392)
>  
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:137)
>  
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1103)
>  
> at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:538) 
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:750)
>  
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:794) 
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:602) 
> {code}
> This is not very clear to a user (warn's at the hdfs-client). It could likely 
> be improved with a more diagnosable message, or at least the direct reason 
> than an EOF.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4167) Add support for restoring/rolling back to a snapshot

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908449#comment-14908449
 ] 

Hadoop QA commented on HDFS-4167:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  21m 48s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 15s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 32s | The applied patch generated  1 
new checkstyle issues (total was 140, now 140). |
| {color:red}-1{color} | checkstyle |   3m 34s | The applied patch generated  1 
new checkstyle issues (total was 50, now 51). |
| {color:red}-1{color} | whitespace |   0m 33s | The patch has 5  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  23m 25s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  71m 33s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 29s | Tests passed in 
hadoop-hdfs-client. |
| | | 149m  6s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.fs.TestHarFileSystem |
|   | hadoop.fs.TestFilterFileSystem |
|   | hadoop.hdfs.TestParallelShortCircuitRead |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit |
|   | hadoop.hdfs.server.namenode.TestAllowFormat |
|   | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens |
|   | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
|   | hadoop.hdfs.protocol.TestBlockListAsLongs |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics |
|   | hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | hadoop.cli.TestDeleteCLI |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory |
|   | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.tools.TestGetGroups |
|   | hadoop.hdfs.TestRemoteBlockReader2 |
|   | hadoop.hdfs.server.namenode.TestStartup |
|   | hadoop.hdfs.protocol.datatransfer.TestPacketReceiver |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
|   | hadoop.hdfs.TestRemoteBlockReader |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRename |
|   | hadoop.hdfs.TestBlockReaderLocal |
|   | hadoop.cli.TestCacheAdminCLI |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.server.namenode.TestNameNodeRecovery |
|   | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir |
|   | hadoop.hdfs.TestKeyProviderCache |
|   | hadoop.fs.loadGenerator.TestLoadGenerator |
|   | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | hadoop.fs.TestFcHdfsSetUMask |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.crypto.TestHdfsCryptoStreams |
|   | hadoop.fs.viewfs.TestViewFsFileStatusHdfs |
|   | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl |
|   | hadoop.hdfs.server.namenode.ha.TestHAConfiguration |
|   | 

[jira] [Updated] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8053:

Attachment: HDFS-8053.003.patch

Thank you [~wheat9]. The v3 patch adds a new {{HdfsConfigurationLoader}} class 
which loads the default resource files. The class looks like the following:
{code:title=HdfsConfigurationLoader.java}
public class HdfsConfigurationLoader {

  /** Load the default resources.
   *
   * This method is idempotent as the {@link Configuration#addDefaultResource}
   * will only add the default resources once, in the order of their
   * appearance.
   */
  public static void load() {
    // adds the default resources
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
  }
}
{code}
The {{DFSClient$Renewer}} will call this method instead of 
{{HdfsConfiguration#init}} at its static initialization section.
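
The call site then looks roughly like this (sketch; the surrounding Renewer code is elided):
{code}
// Static initialization in DFSClient$Renewer: register the HDFS default
// resources without referencing the full HdfsConfiguration class.
static {
  // Ensure hdfs-default.xml / hdfs-site.xml are loaded before the renewer
  // reads any configuration keys.
  HdfsConfigurationLoader.load();
}
{code}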

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch, HDFS-8053.003.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputSream}} classes from {{hadoop-hdfs}} to {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it together.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6264) Provide FileSystem#create() variant which throws exception if parent directory doesn't exist

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908486#comment-14908486
 ] 

Hadoop QA commented on HDFS-6264:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | patch |   0m  0s | The patch file was not named 
according to hadoop's naming conventions. Please see 
https://wiki.apache.org/hadoop/HowToContribute for instructions. |
| {color:blue}0{color} | pre-patch |  22m 26s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  4s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 54s | The applied patch generated  1 
new checkstyle issues (total was 229, now 229). |
| {color:red}-1{color} | checkstyle |   3m 36s | The applied patch generated  1 
new checkstyle issues (total was 46, now 46). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   7m  4s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m  3s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | tools/hadoop tests |   1m 11s | Tests passed in 
hadoop-azure. |
| {color:green}+1{color} | hdfs tests | 166m 34s | Tests passed in hadoop-hdfs. 
|
| {color:green}+1{color} | hdfs tests |   0m 36s | Tests passed in 
hadoop-hdfs-client. |
| | | 245m 57s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762389/hdfs-6264-v2.txt |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e65c5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/diffcheckstylehadoop-common.txt
 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-azure test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/testrun_hadoop-azure.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12678/console |


This message was automatically generated.

> Provide FileSystem#create() variant which throws exception if parent 
> directory doesn't exist
> 
>
> Key: HDFS-6264
> URL: https://issues.apache.org/jira/browse/HDFS-6264
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>  Labels: hbase
> Attachments: hdfs-6264-v1.txt, hdfs-6264-v2.txt
>
>
> FileSystem#createNonRecursive() is deprecated.
> However, there is no DistributedFileSystem#create() implementation which 
> throws an exception if the parent directory doesn't exist.
> This limits clients' migration away from the deprecated method.
> For HBase, IO fencing relies on the behavior of 
> FileSystem#createNonRecursive().
> A variant of the create() method should be added that throws an exception if 
> the parent directory doesn't exist.
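
For context, a sketch of the fencing pattern in question (paths and values are illustrative; this assumes the deprecated FileSystem#createNonRecursive overload taking replication and block size):
{code}
// Sketch: exactly one caller should win creation of the marker file, and a
// missing parent must fail rather than be created implicitly. Illustrative
// paths/values; imports assumed: org.apache.hadoop.conf.Configuration,
// org.apache.hadoop.fs.*, java.io.IOException.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path marker = new Path("/hbase/region-x/fencing-marker");
try {
  FSDataOutputStream out = fs.createNonRecursive(
      marker, false /* overwrite */, 4096, (short) 3, 64L * 1024 * 1024, null);
  out.writeUTF("owner=server-a");   // the winner records its identity
  out.close();
} catch (IOException e) {
  // Parent directory missing (or marker already present): we lost the race.
}
{code}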



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7663) Erasure Coding: lease recovery / append on striped file

2015-09-25 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-7663:

Attachment: (was: HDFS-7663.00.txt)

> Erasure Coding: lease recovery / append on striped file
> ---
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7766) Add a flag to WebHDFS op=CREATE to not respond with a 307 redirect

2015-09-25 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-7766:
---
Attachment: HDFS-7766.03.patch

Here's a patch that includes unit tests.

> Add a flag to WebHDFS op=CREATE to not respond with a 307 redirect
> --
>
> Key: HDFS-7766
> URL: https://issues.apache.org/jira/browse/HDFS-7766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7766.01.patch, HDFS-7766.02.patch, 
> HDFS-7766.03.patch
>
>
> Please see 
> https://issues.apache.org/jira/browse/HDFS-7588?focusedCommentId=14276192=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14276192
> A backwards-compatible way to fix this is to add a flag on the request 
> which would disable the redirect, i.e.
> {noformat}
> curl -i -X PUT 
> "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE[&noredirect=true]"
> {noformat}
> returns 200 with the DN location in the response.
> This would allow the Browser clients to get the redirect URL to put the file 
> to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908409#comment-14908409
 ] 

Haohui Mai commented on HDFS-8053:
--

{code}
-static {
-  //Ensure that HDFS Configuration files are loaded before trying to use
-  // the renewer.
-  HdfsConfiguration.init();
-}
-
{code}

It might make sense to create a new class {{HdfsConfigurationLoader}} in the 
{{hadoop-hdfs-client}} package and load the default configuration there.

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputSream}} classes from {{hadoop-hdfs}} to {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it together.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-25 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908445#comment-14908445
 ] 

Yongjun Zhang commented on HDFS-9100:
-

Hi [~caseyjbrotherton],

Thanks for the patch and the cluster testing. The patch looks good to me, two 
very minor cosmetic comments:

1. Change order of the following two lines:
{code}
import org.apache.hadoop.hdfs.client.HdfsClientConfigKeys;
import org.apache.hadoop.hdfs.DistributedFileSystem;
{code}

2. When a line is wrapped, the indentation of the continuation line needs to be
4 spaces per the coding guideline, like:
{code}
NetUtils.createSocketAddr(target.getDatanodeInfo().
    getXferAddr(Dispatcher.this.connectToDnViaHostname)),
    HdfsConstants.READ_TIMEOUT);
{code}

Thanks.


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking into consideration of 
> dfs.client.use.datanode.hostname setting, this would possibly fail balancer 
> run issued from outside a cluster.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2015-09-25 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-9145:
---

 Summary: Tracking methods that hold FSNamesytemLock for too long
 Key: HDFS-9145
 URL: https://issues.apache.org/jira/browse/HDFS-9145
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Mingliang Liu


It will be helpful if we have a way to track (or at least log a message) when 
some operation holds the FSNamesystem lock for a long time.
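
One minimal shape this could take (a sketch, not a concrete proposal; the threshold, fields, and LOG handle are invented):
{code}
// Sketch: wrap the namesystem write lock so that holders exceeding a
// threshold are logged with a stack trace. Uses
// org.apache.hadoop.util.Time#monotonicNow and
// java.util.concurrent.locks.ReentrantReadWriteLock.
private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
private static final long LOCK_WARN_THRESHOLD_MS = 1000;
private long writeLockHeldAt;

void writeLock() {
  fsLock.writeLock().lock();
  writeLockHeldAt = Time.monotonicNow();
}

void writeUnlock() {
  long heldMs = Time.monotonicNow() - writeLockHeldAt;  // measure before release
  fsLock.writeLock().unlock();
  if (heldMs > LOCK_WARN_THRESHOLD_MS) {
    LOG.warn("FSNamesystem write lock held for " + heldMs + " ms",
        new Exception("lock hold trace"));   // capture the releasing stack
  }
}
{code}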



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908755#comment-14908755
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #447 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/447/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than 1 name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908763#comment-14908763
 ] 

Colin Patrick McCabe commented on HDFS-9107:


+1.  Test failures not related.  We can do cleanups in a follow-on.  Thanks, 
[~daryn]

> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.
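
A hedged sketch of one way to break the cycle (my illustration of the idea, not necessarily the committed change): if the heartbeat monitor itself was stalled, e.g. by a full GC, skip dead-node marking for that cycle instead of trusting the stale timestamps.
{code}
// Sketch only: recheckInterval and LOG are illustrative fields; uses
// org.apache.hadoop.util.Time#monotonicNow.
private long lastHeartbeatCheck = Time.monotonicNow();

void heartbeatCheck() {
  long now = Time.monotonicNow();
  boolean monitorStalled = (now - lastHeartbeatCheck) > 2 * recheckInterval;
  lastHeartbeatCheck = now;
  if (monitorStalled) {
    LOG.warn("Skipping dead-node detection: the monitor thread was paused");
    return;   // give nodes one interval to heartbeat before judging them
  }
  // ... normal dead-node detection ...
}
{code}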



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9132:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   3.0.0
   Status: Resolved  (was: Patch Available)

+1. Thanks for the work from [~cmccabe] and the review from [~hitliuyi].

Committed to trunk and branch-2.

> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908829#comment-14908829
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #441 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/441/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we supported a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy from two HA enabled 
> clusters.
> That confuses haadmin command since we have a check in 
> DFSUtil#getNamenodeServiceAddr which fails if it finds more than 1 name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908831#comment-14908831
 ] 

Hudson commented on HDFS-9107:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #441 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/441/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908739#comment-14908739
 ] 

Hadoop QA commented on HDFS-8053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  24m 32s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | javac |   9m 11s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m  4s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 13s | The applied patch generated  
267 new checkstyle issues (total was 24, now 291). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 49s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 50s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 30s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m 33s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 32s | Tests passed in 
hadoop-hdfs-client. |
| | | 221m 25s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.security.TestPermission |
|   | hadoop.fs.viewfs.TestViewFsHdfs |
|   | hadoop.hdfs.web.TestWebHDFSXAttr |
|   | hadoop.fs.TestWebHdfsFileContextMainOperations |
|   | hadoop.fs.TestGlobPaths |
|   | hadoop.fs.loadGenerator.TestLoadGenerator |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
|   | hadoop.fs.viewfs.TestViewFileSystemWithAcls |
|   | hadoop.fs.contract.hdfs.TestHDFSContractConcat |
|   | hadoop.fs.TestSymlinkHdfsDisable |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory |
|   | hadoop.cli.TestDeleteCLI |
|   | hadoop.fs.viewfs.TestViewFsWithXAttrs |
|   | hadoop.fs.viewfs.TestViewFsDefaultValue |
|   | hadoop.fs.contract.hdfs.TestHDFSContractOpen |
|   | hadoop.tools.TestTools |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRename |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
|   | hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.security.TestRefreshUserMappings |
|   | hadoop.fs.viewfs.TestViewFileSystemWithXAttrs |
|   | hadoop.fs.TestFcHdfsSetUMask |
|   | hadoop.fs.shell.TestHdfsTextCommand |
|   | hadoop.TestGenericRefresh |
|   | hadoop.fs.TestSymlinkHdfsFileSystem |
|   | hadoop.fs.TestUrlStreamHandlerFactory |
|   | hadoop.fs.contract.hdfs.TestHDFSContractAppend |
|   | hadoop.fs.contract.hdfs.TestHDFSContractDelete |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.fs.TestSWebHdfsFileContextMainOperations |
|   | hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSeek |
|   | hadoop.fs.viewfs.TestViewFsWithAcls |
|   | hadoop.fs.TestHDFSFileContextMainOperations |
|   | hadoop.fs.TestResolveHdfsSymlink |
|   | hadoop.fs.TestSymlinkHdfsFileContext |
|   | hadoop.cli.TestXAttrCLI |
|   | hadoop.fs.TestUrlStreamHandler |
|   | hadoop.hdfs.TestClientBlockVerification |
|   | hadoop.security.TestPermissionSymlinks |
|   | hadoop.cli.TestAclCLI |
|   | hadoop.cli.TestCacheAdminCLI |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.hdfs.web.TestWebHdfsContentLength |
|   | hadoop.fs.viewfs.TestViewFileSystemHdfs |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSetTimes |
|   | hadoop.fs.viewfs.TestViewFsFileStatusHdfs |
|   | hadoop.hdfs.TestHDFSTrash |
|   | hadoop.tools.TestJMXGet |
|   | hadoop.fs.TestFcHdfsCreateMkdir |
|   | hadoop.fs.TestFcHdfsPermission |
|   | hadoop.fs.contract.hdfs.TestHDFSContractCreate |
|   | hadoop.net.TestNetworkTopology |
|   | hadoop.hdfs.server.namenode.TestFSNamesystem |
|   | hadoop.fs.TestEnhancedByteBufferAccess |
|   | hadoop.fs.permission.TestStickyBit |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode |
|   | org.apache.hadoop.hdfs.TestPread |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762416/HDFS-8053.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e65c5 |
| checkstyle 

[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908740#comment-14908740
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8519 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8519/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we added a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy between two HA-enabled 
> clusters.
> That confuses the haadmin command, since the check in 
> DFSUtil#getNamenodeServiceAddr fails if it finds more than one name in 
> that property.
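
For context, here is a minimal sketch of how a multi-valued {{dfs.nameservices}} 
setting trips a single-value check. The property name is real; the check below is 
a simplified stand-in for the DFSUtil logic, not the actual code:

{code:java}
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;

public class MultiNameserviceExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // distcp across two HA clusters requires both nameservices to be listed:
    conf.set("dfs.nameservices", "clusterA,clusterB");

    // Simplified stand-in for the single-value check that haadmin hits:
    String[] nsIds = conf.getTrimmedStrings("dfs.nameservices");
    if (nsIds.length > 1) {
      // HDFS-9112 is about making this failure message clear to the user.
      throw new IllegalArgumentException("Expected exactly one nameservice, "
          + "but got " + nsIds.length + ": " + Arrays.toString(nsIds));
    }
  }
}
{code}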



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908733#comment-14908733
 ] 

Hadoop QA commented on HDFS-8053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m  2s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 28s | The applied patch generated  
270 new checkstyle issues (total was 24, now 294). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 32s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 146m 50s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 30s | Tests passed in 
hadoop-hdfs-client. |
| | | 198m 26s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.fs.TestUrlStreamHandler |
|   | hadoop.TestRefreshCallQueue |
|   | hadoop.hdfs.protocolPB.TestPBHelper |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.fs.viewfs.TestViewFsWithAcls |
|   | hadoop.tools.TestTools |
|   | hadoop.fs.contract.hdfs.TestHDFSContractDelete |
|   | hadoop.fs.TestFcHdfsSetUMask |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.fs.contract.hdfs.TestHDFSContractOpen |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
|   | hadoop.fs.contract.hdfs.TestHDFSContractAppend |
|   | hadoop.fs.TestSymlinkHdfsFileSystem |
|   | hadoop.fs.viewfs.TestViewFsDefaultValue |
|   | hadoop.fs.TestSymlinkHdfsFileContext |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestFSInputChecker |
|   | hadoop.cli.TestAclCLI |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSeek |
|   | hadoop.tracing.TestTracing |
|   | hadoop.fs.viewfs.TestViewFsWithXAttrs |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestDatanodeReport |
|   | hadoop.tools.TestJMXGet |
|   | hadoop.fs.contract.hdfs.TestHDFSContractCreate |
|   | hadoop.hdfs.TestWriteRead |
|   | hadoop.fs.TestEnhancedByteBufferAccess |
|   | hadoop.fs.TestFcHdfsPermission |
|   | hadoop.security.TestRefreshUserMappings |
|   | hadoop.fs.viewfs.TestViewFsHdfs |
|   | hadoop.fs.TestResolveHdfsSymlink |
|   | hadoop.tracing.TestTracingShortCircuitLocalRead |
|   | hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus |
|   | hadoop.tracing.TestTraceAdmin |
|   | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | hadoop.fs.TestWebHdfsFileContextMainOperations |
|   | hadoop.fs.TestFcHdfsCreateMkdir |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.fs.viewfs.TestViewFileSystemWithXAttrs |
|   | hadoop.fs.viewfs.TestViewFileSystemWithAcls |
|   | hadoop.security.TestPermission |
|   | hadoop.fs.contract.hdfs.TestHDFSContractConcat |
|   | hadoop.security.TestPermissionSymlinks |
|   | hadoop.net.TestNetworkTopology |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory |
|   | hadoop.cli.TestXAttrCLI |
|   | hadoop.hdfs.TestEncryptionZonesWithHA |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRename |
|   | hadoop.fs.permission.TestStickyBit |
|   | hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSetTimes |
|   | hadoop.hdfs.TestPipelines |
|   | hadoop.fs.loadGenerator.TestLoadGenerator |
|   | hadoop.fs.TestUrlStreamHandlerFactory |
|   | hadoop.cli.TestCacheAdminCLI |
|   | hadoop.fs.TestSWebHdfsFileContextMainOperations |
|   | hadoop.fs.TestHDFSFileContextMainOperations |
|   | hadoop.fs.TestGlobPaths |
|   | hadoop.fs.viewfs.TestViewFsFileStatusHdfs |
|   | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
|   | hadoop.TestGenericRefresh |
|   | hadoop.fs.shell.TestHdfsTextCommand |
|   | hadoop.cli.TestDeleteCLI |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.fs.TestSymlinkHdfsDisable |
|   | hadoop.fs.viewfs.TestViewFileSystemHdfs |
|   | hadoop.hdfs.TestDFSInotifyEventInputStream |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | 

[jira] [Commented] (HDFS-9142) Namenode Http address is not configured correctly for federated cluster in MiniDFSCluster

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908759#comment-14908759
 ] 

Hadoop QA commented on HDFS-9142:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   8m  7s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |  10m 14s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 28s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 32s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 56s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 20s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m 46s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 126m 22s | Tests failed in hadoop-hdfs. |
| | | 154m 31s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762437/HDFS-9142.v3.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 83e65c5 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12684/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12684/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12684/console |


This message was automatically generated.

> Namenode Http address is not configured correctly for federated cluster in 
> MiniDFSCluster
> -
>
> Key: HDFS-9142
> URL: https://issues.apache.org/jira/browse/HDFS-9142
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: HDFS-9142.v1.patch, HDFS-9142.v2.patch, 
> HDFS-9142.v3.patch
>
>
> When setting up simpleHAFederatedTopology in MiniDFSCluster, each Namenode 
> should have its own configuration object, and the configuration should have 
> "dfs.namenode.http-address.<nameservice>.<namenode>" set up correctly for 
> all <nameservice, namenode> pairs.
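
A hedged sketch of the kind of per-NameNode setup the description asks for, using 
the real {{dfs.nameservices}}, {{dfs.ha.namenodes.*}}, and 
{{dfs.namenode.http-address.*}} keys; the nameservice/NameNode names and ports 
are illustrative, not taken from the patch:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class FederatedHttpAddressSetup {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("dfs.nameservices", "ns0,ns1");
    conf.set("dfs.ha.namenodes.ns0", "nn0,nn1");
    conf.set("dfs.ha.namenodes.ns1", "nn0,nn1");

    // Every <nameservice, namenode> pair needs its own suffixed key:
    int port = 50070;
    for (String ns : new String[] {"ns0", "ns1"}) {
      for (String nn : new String[] {"nn0", "nn1"}) {
        conf.set("dfs.namenode.http-address." + ns + "." + nn,
            "127.0.0.1:" + port++);
      }
    }
    System.out.println(conf.get("dfs.namenode.http-address.ns1.nn0"));
  }
}
{code}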



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9107:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.
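
One shape a guard against this failure mode might take: a hedged sketch (not the 
committed patch; the class name, method name, and interval are illustrative) in 
which the heartbeat monitor detects that it was itself stalled and skips one 
round of dead-node marking:

{code:java}
import org.apache.hadoop.util.Time;

class HeartbeatMonitorPauseGuard {
  private static final long RECHECK_INTERVAL_MS = 5 * 60 * 1000L;
  private long lastRunMs = Time.monotonicNow();

  /**
   * Returns true when the monitor thread was stalled (e.g. by a full GC)
   * longer than the recheck interval. In that case, skip marking nodes dead
   * for this cycle, since heartbeats may simply be queued but unprocessed.
   */
  boolean shouldSkipDeadNodeCheck() {
    long now = Time.monotonicNow();
    boolean stalled = (now - lastRunMs) > RECHECK_INTERVAL_MS;
    lastRunMs = now;
    return stalled;
  }
}
{code}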



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908833#comment-14908833
 ] 

Hadoop QA commented on HDFS-9139:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | reexec |   0m  0s | dev-support patch detected. |
| {color:blue}0{color} | pre-patch |  19m 50s | Pre-patch trunk compilation is 
healthy. |
| {color:blue}0{color} | @author |   0m  0s | Skipping @author checks as 
test-patch has been patched. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 21 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 10s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | shellcheck |   0m 10s | There were no new shellcheck 
(v0.3.3) issues. |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 26s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   6m 47s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  42m 25s | Tests failed in hadoop-hdfs. |
| | |  96m 28s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
|   | hadoop.hdfs.server.namenode.TestProtectedDirectories |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.server.namenode.TestSecurityTokenEditLog |
|   | hadoop.hdfs.server.namenode.TestEditLogFileOutputStream |
|   | hadoop.hdfs.server.namenode.TestFileLimit |
|   | hadoop.hdfs.server.namenode.TestFileContextXAttr |
|   | hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys |
|   | hadoop.hdfs.server.namenode.TestNameNodeHttpServer |
|   | hadoop.hdfs.server.namenode.TestSecureNameNode |
|   | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
| Timed out tests | org.apache.hadoop.hdfs.server.mover.TestMover |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762445/HDFS-9139.01.patch |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e99d0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12685/console |


This message was automatically generated.

> Enable parallel JUnit tests for HDFS Pre-commit 
> 
>
> Key: HDFS-9139
> URL: https://issues.apache.org/jira/browse/HDFS-9139
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9139.01.patch
>
>
> Forked from HADOOP-11984. 
> Building on the initial and significant work from [~cnauroth], this Jira 
> tracks enabling parallel test runs for HDFS Precommit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9120) Metric logging values are truncated in NN Metrics log.

2015-09-25 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908839#comment-14908839
 ] 

Arpit Agarwal commented on HDFS-9120:
-

Hi [~kanaka], that sounds fine to me.

bq. Also shall we consider flatten and log values from TabularData & 
CompositeData types which are ignored currently?
Good idea, we could add them to the default exclude list if they are too 
verbose.
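
If we go that route, the flattening itself could be as simple as this hedged 
sketch (illustrative only, not from any patch here):

{code:java}
import javax.management.openmbean.CompositeData;

final class CompositeDataFlattener {
  /** Render a CompositeData value as "key=value, key=value" for the log. */
  static String flatten(CompositeData data) {
    StringBuilder sb = new StringBuilder();
    for (String key : data.getCompositeType().keySet()) {
      if (sb.length() > 0) {
        sb.append(", ");
      }
      sb.append(key).append('=').append(data.get(key));
    }
    return sb.toString();
  }
}
{code}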

> Metric logging values are truncated in NN Metrics log.
> --
>
> Key: HDFS-9120
> URL: https://issues.apache.org/jira/browse/HDFS-9120
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: logging
>Reporter: Archana T
>Assignee: Kanaka Kumar Avvaru
>
> In namenode-metrics.log, when a metric name-value pair is longer than 128 
> characters, it is truncated as below --
> For example, the LiveNodes information is ---
> vi namenode-metrics.log
> {color:red}
> 2015-09-22 10:34:37,891 
> NameNodeInfo:LiveNodes={"host-10-xx-xxx-88:50076":{"infoAddr":"10.xx.xxx.88:0","infoSecureAddr":"10.xx.xxx.88:52100","xferaddr":"10.xx.xxx.88:50076","l...
> {color}
> Here the complete metric value is not logged; the trailing 
> information is displayed as "...".
> Similarly for other metric values in NN metrics, 
> whereas the DN metric log records complete metric values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9142) Namenode Http address is not configured correctly for federated cluster in MiniDFSCluster

2015-09-25 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908843#comment-14908843
 ] 

Siqi Li commented on HDFS-9142:
---

Test failures seem to be unrelated

> Namenode Http address is not configured correctly for federated cluster in 
> MiniDFSCluster
> -
>
> Key: HDFS-9142
> URL: https://issues.apache.org/jira/browse/HDFS-9142
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: HDFS-9142.v1.patch, HDFS-9142.v2.patch, 
> HDFS-9142.v3.patch
>
>
> When setting up simpleHAFederatedTopology in MiniDFSCluster, each Namenode 
> should have its own configuration object, and the configuration should have 
> "dfs.namenode.http-address.<nameservice>.<namenode>" set up correctly for 
> all <nameservice, namenode> pairs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9133:

   Resolution: Fixed
Fix Version/s: 2.8.0
   3.0.0
   Status: Resolved  (was: Patch Available)

LGTM. +1.

Thanks for the effort [~cmccabe] and [~hitliuyi].

> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.
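
The convention at issue, shown as a hedged sketch against an in-memory replica 
(an illustrative helper, not the patched ExternalBlockReader code):

{code:java}
static int read(byte[] replica, long pos, byte[] buf, int off, int len) {
  if (pos >= replica.length) {
    return -1;  // at EOF return -1, not 0, per the BlockReader JavaDoc
  }
  int toRead = (int) Math.min(len, replica.length - pos);
  System.arraycopy(replica, (int) pos, buf, off, toRead);
  return toRead;
}
{code}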



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908885#comment-14908885
 ] 

Hudson commented on HDFS-9132:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8521 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8521/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.
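
The plumbing the description implies, as a hedged sketch (the setter name and 
shape are illustrative, not the exact API added by the patch):

{code:java}
abstract class ReplicaAccessorBuilderSketch {
  protected long genstamp;

  /** The client records the expected generation stamp before building. */
  public ReplicaAccessorBuilderSketch setGenerationStamp(long genstamp) {
    this.genstamp = genstamp;
    return this;
  }

  /** Implementations can reject replicas whose genstamp does not match. */
  public abstract Object build();
}
{code}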



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908887#comment-14908887
 ] 

Hudson commented on HDFS-9107:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8521 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8521/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908886#comment-14908886
 ] 

Hudson commented on HDFS-9133:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8521 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8521/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy

2015-09-25 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908908#comment-14908908
 ] 

Ming Ma commented on HDFS-8647:
---

Thanks [~brahmareddy]. Maybe we can move {{hasClusterEverBeenMultiRack}} from 
DatanodeManager to NetworkTopology? Then {{BlockPlacementPolicyDefault}}'s 
{{verifyBlockPlacement}} can ask {{clusterMap}} if the cluster has ever been 
multi rack. In that way, we completely remove the multi rack reference from 
BlockManager.

Regardless of the approach, there is a behavior change in 
{{BlockPlacementPolicyDefault}}'s {{verifyBlockPlacement}}, which is used by 
fsck. When the number of racks is reduced to 1, fsck used to report OK; with the 
change, it will indicate a rack policy violation. That should be ok.

Nits: could you please clean up the whitespace? Also, the descriptions you added 
to {{chooseReplicaToDelete}} don't match the parameter names.
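
A hedged sketch of where that state could live after the move (the names mirror 
the discussion above, but the code is illustrative, not from any posted patch):

{code:java}
class NetworkTopologySketch {
  private int numRacks = 0;
  private boolean everMultiRack = false;

  /** Called whenever a new rack appears in the topology. */
  synchronized void rackAdded() {
    numRacks++;
    if (numRacks > 1) {
      everMultiRack = true;
    }
  }

  /** BlockPlacementPolicyDefault#verifyBlockPlacement could ask clusterMap this. */
  synchronized boolean hasClusterEverBeenMultiRack() {
    return everMultiRack;
  }
}
{code}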

> Abstract BlockManager's rack policy into BlockPlacementPolicy
> -
>
> Key: HDFS-8647
> URL: https://issues.apache.org/jira/browse/HDFS-8647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, 
> HDFS-8647-003.patch, HDFS-8647-004.patch
>
>
> Sometimes we want to have namenode use alternative block placement policy 
> such as upgrade domains in HDFS-7541.
> BlockManager has built-in assumption about rack policy in functions such as 
> useDelHint, blockHasEnoughRacks. That means when we have new block placement 
> policy, we need to modify BlockManager to account for the new policy. Ideally 
> BlockManager should ask BlockPlacementPolicy object instead. That will allow 
> us to provide new BlockPlacementPolicy without changing BlockManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908943#comment-14908943
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2385 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2385/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we added a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy between two HA-enabled 
> clusters.
> That confuses the haadmin command, since the check in 
> DFSUtil#getNamenodeServiceAddr fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908946#comment-14908946
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2358 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2358/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we added a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy between two HA-enabled 
> clusters.
> That confuses the haadmin command, since the check in 
> DFSUtil#getNamenodeServiceAddr fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908956#comment-14908956
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #418 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/418/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we added a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy between two HA-enabled 
> clusters.
> That confuses the haadmin command, since the check in 
> DFSUtil#getNamenodeServiceAddr fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9148:
--
Attachment: hdfs-9148.patch

A pretty trivial change to the assert text.

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Attachments: hdfs-9148.patch
>
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9148:
--
Attachment: (was: hdfs-9148.patch)

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Attachments: hdfs-9148.patch
>
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9112) Improve error message for Haadmin when multiple name service IDs are configured

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908898#comment-14908898
 ] 

Hudson commented on HDFS-9112:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1180 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1180/])
HDFS-9112. Improve error message for Haadmin when multiple name service IDs are 
configured. Contributed by Anu Engineer. (jing9: rev 
83e99d06d0e5a71888aab33e9ae47460e9f1231f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/NNHAServiceTarget.java


> Improve error message for Haadmin when multiple name service IDs are 
> configured
> ---
>
> Key: HDFS-9112
> URL: https://issues.apache.org/jira/browse/HDFS-9112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-9112.001.patch, HDFS-9112.002.patch, 
> HDFS-9112.003.patch, HDFS-9112.004.patch
>
>
> In HDFS-6376 we added a feature for distcp that allows multiple 
> NameService IDs to be specified so that we can copy between two HA-enabled 
> clusters.
> That confuses the haadmin command, since the check in 
> DFSUtil#getNamenodeServiceAddr fails if it finds more than one name in 
> that property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908930#comment-14908930
 ] 

Hudson commented on HDFS-9132:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #442 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/442/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908931#comment-14908931
 ] 

Hudson commented on HDFS-9133:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #442 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/442/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908939#comment-14908939
 ] 

Hudson commented on HDFS-9132:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #448 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/448/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908941#comment-14908941
 ] 

Hudson commented on HDFS-9107:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #448 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/448/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues which increases memory pressure. The replications create a 
> flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC which repeats 
> the entire cycle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908940#comment-14908940
 ] 

Hudson commented on HDFS-9133:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #448 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/448/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)
Tony Wu created HDFS-9148:
-

 Summary: Incorrect assert message in 
TestWriteToReplica#testWriteToTemporary
 Key: HDFS-9148
 URL: https://issues.apache.org/jira/browse/HDFS-9148
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.1
Reporter: Tony Wu
Priority: Trivial


The following assert text in TestWriteToReplica#testWriteToTemporary is not 
correct:
{code:java}
  Assert.fail("createRbw() Should have removed the block with the older "
  + "genstamp and replaced it with the newer one: " + 
blocks[NON_EXISTENT]);
{code}

If the assert is triggered, it can only be because a temporary replica already 
exists with a newer generation stamp. It should have nothing to do with 
createRbw().




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu reassigned HDFS-9148:
-

Assignee: Tony Wu

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9148:
--
Target Version/s: 3.0.0  (was: 3.0.0, 2.7.2)

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908959#comment-14908959
 ] 

Hadoop QA commented on HDFS-8873:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 57s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  1s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 26s | The applied patch generated  8 
new checkstyle issues (total was 439, now 439). |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 3  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 33s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 162m 38s | Tests failed in hadoop-hdfs. |
| | | 208m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestDFSClientFailover |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.TestDistributedFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762175/HDFS-8873.009.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e99d0 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12687/console |


This message was automatically generated.

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch, HDFS-8873.004.patch, HDFS-8873.005.patch, 
> HDFS-8873.006.patch, HDFS-8873.007.patch, HDFS-8873.008.patch, 
> HDFS-8873.009.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, a full directory listing requires 64K seeks, which translates to 
> 655 seconds).
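
A hedged sketch of what a configurable duty cycle could look like (illustrative 
only, not taken from the posted patches):

{code:java}
class DutyCycleThrottle {
  private final long runMs;   // time to scan per cycle
  private final long idleMs;  // time to yield the disk per cycle
  private long cycleStart = System.currentTimeMillis();

  DutyCycleThrottle(long cycleMs, double dutyCycle) {
    this.runMs = (long) (cycleMs * dutyCycle);
    this.idleMs = cycleMs - runMs;
  }

  /** Call periodically from the scan loop. */
  void maybePause() throws InterruptedException {
    if (System.currentTimeMillis() - cycleStart >= runMs) {
      Thread.sleep(idleMs);                     // rest of the cycle is idle
      cycleStart = System.currentTimeMillis();  // start a new duty cycle
    }
  }
}
{code}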



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9148:
--
Issue Type: Improvement  (was: Bug)

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Tony Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9148:
--
Attachment: hdfs-9148.patch

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Attachments: hdfs-9148.patch
>
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9148) Incorrect assert message in TestWriteToReplica#testWriteToTemporary

2015-09-25 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908993#comment-14908993
 ] 

Daniel Templeton commented on HDFS-9148:


Looks good.  Thanks, [~twu].  Just add some parentheses after 
"createTemporary", i.e. createTemporary(), and you have my +1.

> Incorrect assert message in TestWriteToReplica#testWriteToTemporary
> ---
>
> Key: HDFS-9148
> URL: https://issues.apache.org/jira/browse/HDFS-9148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Attachments: hdfs-9148.patch
>
>
> The following assert text in TestWriteToReplica#testWriteToTemporary is not 
> correct:
> {code:java}
>   Assert.fail("createRbw() Should have removed the block with the older "
>   + "genstamp and replaced it with the newer one: " + 
> blocks[NON_EXISTENT]);
> {code}
> If the assert is triggered, it can only be because a temporary replica 
> already exists with a newer generation stamp. It should have nothing to do 
> with createRbw().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8053) Move DFSIn/OutputStream and related classes to hadoop-hdfs-client

2015-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909003#comment-14909003
 ] 

Hadoop QA commented on HDFS-8053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  22m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | javac |   9m 12s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 17s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 54s | The applied patch generated  
268 new checkstyle issues (total was 24, now 292). |
| {color:red}-1{color} | whitespace |   0m  2s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 54s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 39s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m  7s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 37s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 213m 45s | Tests failed in hadoop-hdfs. |
| {color:red}-1{color} | hdfs tests |   0m 35s | Tests failed in 
hadoop-hdfs-client. |
| | | 272m 17s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.TestFSNamesystem |
| Timed out tests | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
| Failed build | hadoop-hdfs-client |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12762457/HDFS-8053.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 83e99d0 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12686/console |


This message was automatically generated.

> Move DFSIn/OutputStream and related classes to hadoop-hdfs-client
> -
>
> Key: HDFS-8053
> URL: https://issues.apache.org/jira/browse/HDFS-8053
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-8053.000.patch, HDFS-8053.001.patch, 
> HDFS-8053.002.patch, HDFS-8053.003.patch, HDFS-8053.004.patch
>
>
> This jira tracks the effort of moving the {{DFSInputStream}} and 
> {{DFSOutputSream}} classes from {{hadoop-hdfs}} to {{hadoop-hdfs-client}} 
> module.
> Guidelines:
> * As the {{DFSClient}} is heavily coupled to these two classes, we should 
> move it along with them.
> * Related classes should be addressed in separate jiras if they're 
> independent and complex enough.
> * The checkstyle warnings can be addressed in [HDFS-8979 | 
> https://issues.apache.org/jira/browse/HDFS-8979]
> * Removing the _slf4j_ logger guards when calling {{LOG.debug()}} and 
> {{LOG.trace()}} can be addressed in [HDFS-8971 | 
> https://issues.apache.org/jira/browse/HDFS-8971]; a sketch of the guard 
> removal follows below.
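
For context on the last guideline: the _slf4j_ API makes the 
{{isDebugEnabled()}}/{{isTraceEnabled()}} guards unnecessary, because 
parameterized messages are only formatted when the level is actually enabled. 
A before/after sketch (illustrative only - {{dnAddr}} is a made-up variable, 
and the actual cleanup is tracked in HDFS-8971):

{code:java}
// Before: a guard is needed because string concatenation builds the
// message even when debug logging is off.
if (LOG.isDebugEnabled()) {
  LOG.debug("Connecting to datanode " + dnAddr);
}

// After: slf4j formats the message lazily, only when debug is enabled,
// so the guard can be dropped.
LOG.debug("Connecting to datanode {}", dnAddr);
{code}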



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909010#comment-14909010
 ] 

Hudson commented on HDFS-9133:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1181/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.
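
The contract here is that a read at end-of-stream must return -1, not 0: a 0 
return tells callers "no bytes this time", which can leave a reader spinning 
forever. A minimal sketch of an EOF-aware read, assuming a hypothetical 
in-memory accessor whose {{pos}}, {{replicaLength}} and {{data}} fields are 
illustrative (this is not the committed patch):

{code:java}
public int read(byte[] buf, int off, int len) throws IOException {
  if (pos >= replicaLength) {
    return -1;  // at EOF: signal with -1 per the BlockReader javadoc, never 0
  }
  int toRead = (int) Math.min(len, replicaLength - pos);
  System.arraycopy(data, (int) pos, buf, off, toRead);
  pos += toRead;
  return toRead;
}
{code}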



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909011#comment-14909011
 ] 

Hudson commented on HDFS-9107:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1181/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues, which increases memory pressure. The replications create 
> a flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration, 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc. - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC, which repeats 
> the entire cycle.
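
One defensive pattern that breaks the cycle (a sketch under assumptions, not 
necessarily the change committed in HDFS-9107.patch) is for the heartbeat 
monitor to detect that it was itself stalled - e.g. by a full GC - and skip 
dead-node marking for that pass. All names below except 
{{Time.monotonicNow()}} are illustrative, and the stall threshold is 
arbitrary:

{code:java}
// Hypothetical sketch inside the heartbeat monitor loop. If far more wall
// time elapsed than one recheck interval, the monitor thread itself was
// paused, so node heartbeats are stale through no fault of the nodes.
long now = Time.monotonicNow();
long sinceLastCheck = now - lastCheckTime;
lastCheckTime = now;
if (sinceLastCheck > 4 * recheckIntervalMs) {  // threshold is illustrative
  LOG.warn("Skipping dead-node detection after a " + sinceLastCheck
      + " ms monitor stall");
} else {
  heartbeatCheck();  // normal path: scan for nodes past the dead interval
}
{code}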



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909009#comment-14909009
 ] 

Hudson commented on HDFS-9132:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1181 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1181/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.
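
In practice this presumably means a new setter on the builder, so that a 
pluggable accessor can refuse to serve a replica whose generation stamp does 
not match what the client expects. A hedged sketch of the shape of such an 
addition (the real signature is whatever HDFS-9132.001.patch adds):

{code:java}
// Hypothetical sketch; the method name is illustrative.
public abstract class ReplicaAccessorBuilder {
  /**
   * Set the generation stamp of the block which is being opened, so the
   * plugin can detect and reject a stale replica.
   */
  public abstract ReplicaAccessorBuilder setGenerationStamp(long genstamp);
  // other setters (block id, pool id, visible length, ...) omitted
}
{code}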



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9132) Pass genstamp to ReplicaAccessorBuilder

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909017#comment-14909017
 ] 

Hudson commented on HDFS-9132:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2359 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2359/])
HDFS-9132. Pass genstamp to ReplicaAccessorBuilder. (Colin Patrick McCabe via 
Lei (Eddy) Xu) (lei: rev 5eb237d544fc8eeea85ac4bd4f7500edd49c8727)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessorBuilder.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java


> Pass genstamp to ReplicaAccessorBuilder
> ---
>
> Key: HDFS-9132
> URL: https://issues.apache.org/jira/browse/HDFS-9132
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9132.001.patch
>
>
> We should pass the desired genstamp of the block we want to read to 
> ExternalReplicaBuilder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9107) Prevent NN's unrecoverable death spiral after full GC

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909019#comment-14909019
 ] 

Hudson commented on HDFS-9107:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2359 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2359/])
HDFS-9107. Prevent NN's unrecoverable death spiral after full GC (Daryn Sharp 
via Colin P. McCabe) (cmccabe: rev 4e7c6a653f108d44589f84d78a03d92ee0e8a3c3)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
Add HDFS-9107 to CHANGES.txt (cmccabe: rev 
878504dcaacdc1bea42ad571ad5f4e537c1d7167)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Prevent NN's unrecoverable death spiral after full GC
> -
>
> Key: HDFS-9107
> URL: https://issues.apache.org/jira/browse/HDFS-9107
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9107.patch, HDFS-9107.patch
>
>
> A full GC pause in the NN that exceeds the dead node interval can lead to an 
> infinite cycle of full GCs.  The most common situation that precipitates an 
> unrecoverable state is a network issue that temporarily cuts off multiple 
> racks.
> The NN wakes up and falsely starts marking nodes dead. This bloats the 
> replication queues, which increases memory pressure. The replications create 
> a flurry of incremental block reports and a glut of over-replicated blocks.
> The "dead" nodes heartbeat within seconds. The NN forces a re-registration, 
> which requires a full block report - more memory pressure. The NN now has to 
> invalidate all the over-replicated blocks. The extra blocks are added to 
> invalidation queues, tracked in an excess blocks map, etc. - much more memory 
> pressure.
> All the memory pressure can push the NN into another full GC, which repeats 
> the entire cycle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9133) ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF

2015-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909018#comment-14909018
 ] 

Hudson commented on HDFS-9133:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2359 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2359/])
HDFS-9133. ExternalBlockReader and ReplicaAccessor need to return -1 on read 
when at EOF. (Colin Patrick McCabe via Lei (Eddy) Xu) (lei: rev 
67b0e967f0e13eb6bed123fc7ba4cce0dcca198f)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReplicaAccessor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestExternalBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java


> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
> -
>
> Key: HDFS-9133
> URL: https://issues.apache.org/jira/browse/HDFS-9133
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9133.001.patch
>
>
> ExternalBlockReader and ReplicaAccessor need to return -1 on read when at 
> EOF, as per the JavaDoc in BlockReader.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

