[jira] [Updated] (HDFS-10320) Rack failures may result in NN terminate

2016-04-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10320:
-
Status: Patch Available  (was: Open)

> Rack failures may result in NN terminate
> 
>
> Key: HDFS-10320
> URL: https://issues.apache.org/jira/browse/HDFS-10320
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-10320.01.patch
>
>
> If there are rack failures which end up leaving only 1 rack available, 
> {{BlockPlacementPolicyDefault#chooseRandom}} may get an 
> {{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
> which then propagates all the way out to {{BlockManager}}'s 
> {{ReplicationMonitor}} thread and terminates the NN.
> Log:
> {noformat}
> 2016-02-24 09:22:01,514  WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-02-24 09:22:01,958  ERROR 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception. 
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to 
> find datanode (scope="" excludedScope="/rack_a5").
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3746)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3711)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1400)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1306)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3682)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3634)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Comment Edited] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251275#comment-15251275
 ] 

Colin Patrick McCabe edited comment on HDFS-10301 at 4/21/16 4:58 AM:
--

Thanks for the bug report.  This is a tricky one.

One small correction-- HDFS-7960 was not introduced as part of DataNode 
hotswap.  It was originally introduced to solve issues caused by HDFS-7575, 
although it fixed issues with hotswap as well.

It seems like we should be able to remove existing DataNode storage report RPCs 
with the old ID from the queue when we receive one with a new block report ID.  
This would also avoid a possible congestion collapse scenario caused by 
repeated retransmissions after the timeout.


was (Author: cmccabe):
Thanks for the bug report.  This is a tricky one.

One small correction-- HDFS-7960 was not introduced as part of DataNode 
hotswap.  It was originally introduced to solve issues caused by HDF-7575, 
although it fixed issues with hotswap as well.

It seems like we should be able to remove existing DataNode storage report RPCs 
with the old ID from the queue when we receive one with a new block report ID.  
This would also avoid a possible congestion collapse scenario caused by 
repeated retransmissions after the timeout.

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-10301.01.patch, zombieStorageLogs.rtf
>
>
> When NameNode is busy a DataNode can timeout sending a block report. Then it 
> sends the block report again. Then NameNode while process these two reports 
> at the same time can interleave processing storages from different reports. 
> This screws up the blockReportId field, which makes NameNode think that some 
> storages are zombie. Replicas from zombie storages are immediately removed, 
> causing missing blocks.





[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251275#comment-15251275
 ] 

Colin Patrick McCabe commented on HDFS-10301:
-

Thanks for the bug report.  This is a tricky one.

One small correction-- HDFS-7960 was not introduced as part of DataNode 
hotswap.  It was originally introduced to solve issues caused by HDFS-7575, 
although it fixed issues with hotswap as well.

It seems like we should be able to remove existing DataNode storage report RPCs 
with the old ID from the queue when we receive one with a new block report ID.  
This would also avoid a possible congestion collapse scenario caused by 
repeated retransmissions after the timeout.
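
A minimal sketch of that idea (hypothetical names and types, not an actual 
patch): evict any queued report from the same datanode that carries a stale 
block report ID before enqueuing the new one.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

class StorageReportQueue {
  static final class QueuedReport {
    final String datanodeUuid;
    final long blockReportId;
    QueuedReport(String datanodeUuid, long blockReportId) {
      this.datanodeUuid = datanodeUuid;
      this.blockReportId = blockReportId;
    }
  }

  private final Deque<QueuedReport> queue = new ArrayDeque<>();

  // On receiving a storage report RPC: drop queued reports from the same
  // datanode whose block report ID differs (stale retransmissions), then
  // enqueue the incoming one.
  synchronized void enqueue(QueuedReport incoming) {
    queue.removeIf(q -> q.datanodeUuid.equals(incoming.datanodeUuid)
        && q.blockReportId != incoming.blockReportId);
    queue.addLast(incoming);
  }

  synchronized QueuedReport dequeue() {
    return queue.pollFirst();
  }
}
{code}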

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-10301.01.patch, zombieStorageLogs.rtf
>
>
> When NameNode is busy a DataNode can timeout sending a block report. Then it 
> sends the block report again. Then NameNode while process these two reports 
> at the same time can interleave processing storages from different reports. 
> This screws up the blockReportId field, which makes NameNode think that some 
> storages are zombie. Replicas from zombie storages are immediately removed, 
> causing missing blocks.





[jira] [Updated] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

2016-04-20 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-10313:
-
Attachment: HDFS-10313.002.patch

Thanks [~yzhangal] for the review. Updated with the latest patch to address the comments.
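
For reference, the ordering check could look roughly like the sketch below. It 
assumes the modification time of a snapshot's root under {{.snapshot}} is an 
acceptable proxy for creation order; class and method names are illustrative, 
not the actual patch.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class SnapshotOrderCheck {
  // Aborts with an informative message unless toSnapshot is newer than
  // fromSnapshot.
  static void checkSnapshotsOrder(FileSystem fs, Path snapshotRoot,
      String fromSnapshot, String toSnapshot) throws IOException {
    long fromTime = fs.getFileStatus(
        new Path(snapshotRoot, ".snapshot/" + fromSnapshot))
        .getModificationTime();
    long toTime = fs.getFileStatus(
        new Path(snapshotRoot, ".snapshot/" + toSnapshot))
        .getModificationTime();
    if (toTime <= fromTime) {
      throw new IllegalArgumentException("Snapshot " + toSnapshot
          + " must be newer than snapshot " + fromSnapshot
          + " when passed to -diff");
    }
  }
}
{code}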

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch, HDFS-10313.002.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is 
> passed, we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.





[jira] [Updated] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10207:
-
Attachment: HDFS-10207-HDFS-9000.008.patch

Patch v008 fixed two checkstyle issues.

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch, 
> HDFS-10207-HDFS-9000.007.patch, HDFS-10207-HDFS-9000.008.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.
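
For context, the hook would presumably follow Hadoop's {{ReconfigurableBase}} 
pattern, along the lines of the sketch below (hypothetical code, not the 
attached patch; {{setClientBackoffEnabled}} is an assumed setter name).
{code}
import org.apache.hadoop.conf.ReconfigurationException;

// Inside a ReconfigurableBase subclass such as the NameNode:
@Override
protected String reconfigurePropertyImpl(String property, String newVal)
    throws ReconfigurationException {
  if (property.startsWith("ipc.") && property.endsWith(".backoff.enable")) {
    boolean enable = Boolean.parseBoolean(newVal);
    // assumed setter: flip backoff on the live RPC server's call queue
    rpcServer.getClientRpcServer().setClientBackoffEnabled(enable);
    return Boolean.toString(enable);
  }
  throw new ReconfigurationException(property, newVal, getConf().get(property));
}
{code}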





[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-20 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251220#comment-15251220
 ] 

Takanobu Asanuma commented on HDFS-8057:


I created the jira (HDFS-10321) for cleaning up the checkstyle warnings.

> Move BlockReader implementation to the client implementation package
> 
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch, HDFS-8057.2.patch
>
>
> BlockReaderLocal, RemoteBlockReader, etc should be moved to 
> org.apache.hadoop.hdfs.client.impl.  We may as well rename RemoteBlockReader 
> to BlockReaderRemote.





[jira] [Created] (HDFS-10321) Clean up checkstyle warnings in BlockReader

2016-04-20 Thread Takanobu Asanuma (JIRA)
Takanobu Asanuma created HDFS-10321:
---

 Summary: Clean up checkstyle warnings in BlockReader
 Key: HDFS-10321
 URL: https://issues.apache.org/jira/browse/HDFS-10321
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma
Priority: Minor


There are some checkstyle warnings in BlockReader and related classes. We 
should clean them up.





[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251217#comment-15251217
 ] 

Xiaobing Zhou commented on HDFS-10207:
--

Thanks [~xyao], this is the output of patch v005; please disregard it.

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch, 
> HDFS-10207-HDFS-9000.007.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.





[jira] [Updated] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-20 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-8057:
---
Attachment: HDFS-8057.2.patch

Thanks for the comment, Nicholas.

I uploaded a new patch. I used reflection in {{TestBlockReaderFactory}} to 
avoid the findbugs warnings.

The last reported errors were caused by HDFS-10265 and are fixed now.
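
For illustration, the reflection workaround is roughly the following (the 
helper and field names are hypothetical, not the actual test code):
{code}
import java.lang.reflect.Field;

// Swap a private collaborator inside the factory from a test without widening
// the production API, which avoids the findbugs complaint.
static void injectByReflection(Object target, String fieldName, Object value)
    throws ReflectiveOperationException {
  Field f = target.getClass().getDeclaredField(fieldName);
  f.setAccessible(true);
  f.set(target, value);
}
{code}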

> Move BlockReader implementation to the client implementation package
> 
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch, HDFS-8057.2.patch
>
>
> BlockReaderLocal, RemoteBlockReader, etc should be moved to 
> org.apache.hadoop.hdfs.client.impl.  We may as well rename RemoteBlockReader 
> to BlockReaderRemote.





[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251177#comment-15251177
 ] 

Hadoop QA commented on HDFS-10207:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 32s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 8s 
{color} | {color:red} root: patch generated 2 new + 437 unchanged - 1 fixed = 
439 total (was 438) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 11s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 50s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 11s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 35s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 54m 0s 
{color} | {color:green} hadoop-hdfs in the 

[jira] [Commented] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251159#comment-15251159
 ] 

Hadoop QA commented on HDFS-10224:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 49s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 13s 
{color} | {color:red} root: patch generated 9 new + 144 unchanged - 2 fixed = 
153 total (was 146) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
53s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 4m 56s 
{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.8.0_77 with 
JDK v1.8.0_77 generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) {color} 
|
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 8m 30s 
{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client-jdk1.7.0_95 with 
JDK v1.7.0_95 generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) {color} 
|
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 36s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 23s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK 

[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor

2016-04-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251150#comment-15251150
 ] 

Walter Su commented on HDFS-10220:
--

1. isMaxFilesCheckedToReleaseLease is not required to be a function.
2. To repeat what [~vinayrpet] said, removeFilesInLease(leaseToCheck, 
removing); may not be required.
3. The LOG.warn("..") is kind of verbose.
4. I think the config should stay internal; it's an implementation detail. The 
re-check interval is 2s and is hard-coded too. Besides, it's too complicated 
for a user to pick the right value. Instead of counting the files, I prefer 
counting the time: if the lock is held for too long, log a warning and break 
out for a while (see the sketch below).
5. btw, HDFS-9311 should solve this issue.
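
A rough sketch of the time-based alternative in 4 (names approximate 
LeaseManager's internals, and the 500ms limit is just an example):
{code}
private static final long MAX_LOCK_HOLD_MS = 500; // example value

// Called with the lock held, as today.
private void checkLeases() {
  final long start = System.currentTimeMillis();
  while (!sortedLeases.isEmpty() && expiredHardLimit(sortedLeases.first())) {
    internalReleaseLease(sortedLeases.first());
    if (System.currentTimeMillis() - start > MAX_LOCK_HOLD_MS) {
      LOG.warn("Lease release held the lock over " + MAX_LOCK_HOLD_MS
          + "ms; breaking out, remaining leases are retried on the next"
          + " 2s re-check");
      break; // let other waiters in; the Monitor re-checks shortly
    }
  }
}
{code}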


> Namenode failover due to too long locking in LeaseManager.Monitor
> 
>
> Key: HDFS-10220
> URL: https://issues.apache.org/jira/browse/HDFS-10220
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Nicolas Fraison
>Assignee: Nicolas Fraison
>Priority: Minor
> Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, 
> HADOOP-10220.003.patch, HADOOP-10220.004.patch, threaddump_zkfc.txt
>
>
> I have faced a namenode failover due to an unresponsive namenode detected by 
> the zkfc, with lots of WARN messages (5 million) like this one:
> _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All 
> existing blocks are COMPLETE, lease removed, file closed._
> In the threaddump taken by the zkfc there are lots of threads blocked on a 
> lock.
> Looking at the code, a lock is taken by the LeaseManager.Monitor when leases 
> must be released. Due to the really big number of leases to be released, the 
> namenode took too long to release them, blocking all other tasks and making 
> the zkfc think that the namenode was unavailable/stuck.
> The idea of this patch is to limit the number of leases released on each 
> check, so the lock won't be held for too long a period.





[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-04-20 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251140#comment-15251140
 ] 

Weiwei Yang commented on HDFS-6489:
---

[~raviprak] Thanks for looking at this. 

#1 Yes, this issue can be reproduced by appending to the same file many times 
(closing the stream each time), and also by appending to different files. What 
really matters is that the append API is used many times within a short time 
window.

#2 I was proposing to wait for the DU thread to refresh only when a datanode 
finds that there is not enough space for an append operation, so the wait only 
happens when it actually helps (rather than failing). And once the space usage 
is updated, there is no need to wait again until the problem comes up again. 
I'd love to know if you have any alternative approach.

I'll upload a patch that applies to the latest trunk shortly. Thanks for 
looking into this.
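
To make the over-counting concrete, the gist is the following (illustrative 
code, not the actual DataNode implementation):
{code}
private long dfsUsed;

// On each finalized append, DFS-used grows by the whole block length instead
// of the appended delta.
void onAppendFinalized(long oldBlockLen, long newBlockLen) {
  dfsUsed += newBlockLen;                  // buggy: +60M for a 10-byte append
  // expected accounting:
  // dfsUsed += newBlockLen - oldBlockLen; // +10 bytes
}
{code}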

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space 
> on each block write operation. This is correct in most scenarios (creating a 
> new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M). Then I try to 
> append to it very frequently, but each time I append only 10 bytes;
> Then on each append, dfs used will be increased by the length of the 
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+); assume the block size is 32M (half of the 
> default value), then the dfs used will be increased by 1000*32M = 32G on each 
> append to the files; but actually I only write 10K bytes; this will cause the 
> datanode to report insufficient disk space on data write.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}





[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251041#comment-15251041
 ] 

Hadoop QA commented on HDFS-10175:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 38s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 42s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-common-project/hadoop-common: patch generated 29 
new + 131 unchanged - 0 fixed = 160 total (was 131) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s 
{color} | {color:red} hadoop-common-project/hadoop-common generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 37s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 44s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 0s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
|  |  Should 
org.apache.hadoop.fs.FileSystemStorageStatistics$LongStatisticIterator be a 
_static_ inner class?  At FileSystemStorageStatistics.java:inner class?  At 
FileSystemStorageStatistics.java:[lines 50-74] |
| JDK v1.8.0_77 Failed junit tests | hadoop.fs.TestFilterFileSystem |
|   | 

[jira] [Commented] (HDFS-10297) Increase default balance bandwidth and concurrent moves

2016-04-20 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251006#comment-15251006
 ] 

John Zhuge commented on HDFS-10297:
---

Could someone kindly review this simple patch?

> Increase default balance bandwidth and concurrent moves
> ---
>
> Key: HDFS-10297
> URL: https://issues.apache.org/jira/browse/HDFS-10297
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10297.001.patch, HDFS-10297.002.patch, 
> HDFS-10297.003.patch
>
>
> Adjust the default values to better support the current level of customer 
> host and network configurations.
> Increase the default for property {{dfs.datanode.balance.bandwidthPerSec}} 
> from 1 MB/s to 10 MB/s. Apply to DN. 10 MB/s is about 10% of a GbE network.
> Increase the default for property 
> {{dfs.datanode.balance.max.concurrent.moves}} from 5 to 50. Apply to DN and 
> Balancer. The default number of DN receiver threads is 4096. The default 
> number of balancer mover threads is 1000.
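
For reference, the two knobs discussed above, set programmatically (a minimal 
sketch; the values are the proposed new defaults):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ProposedBalancerDefaults {
  public static Configuration proposed() {
    Configuration conf = new HdfsConfiguration();
    // 10 MB/s, up from 1 MB/s (about 10% of a GbE link)
    conf.setLong("dfs.datanode.balance.bandwidthPerSec", 10L * 1024 * 1024);
    // 50 concurrent moves, up from 5
    conf.setInt("dfs.datanode.balance.max.concurrent.moves", 50);
    return conf;
  }
}
{code}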





[jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system

2016-04-20 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250993#comment-15250993
 ] 

Zhe Zhang commented on HDFS-9806:
-

Thanks [~chris.douglas] for proposing the work and [~virajith] for the 
HDFS-9809 patch.

I wonder if we can simplify the problem to "allowing some HDFS *files* to be 
provided by an external storage system", while still satisfying most 
requirements. In most production clusters, over 95% of files are single-block, 
so having the staging logic at the file (vs. block) level should achieve most 
of the benefits of _using a small HDFS cluster to present a large amount of 
data_. I had some discussions with [~eddyxu] about the ideas below:

Let's first assume that we already have (or are going to have) an 
{{o.a.h.FileSystem}} connector to the external storage system -- S3, ADL, 
Aliyun, GCS etc. I did a very small PoC patch based on {{ViewFS}}. The idea is 
simply to have a {{smallFS}} and a {{bigFS}}, and to use the {{smallFS}} (which 
is always an HDFS) as a cache. Writes will always land on {{smallFS}} first -- 
hence supporting HDFS-level consistency such as hflush. Different write-back 
policies for {{bigFS}} can be added, such as write-through, write-back, and 
30-sec flushing like Linux. Read operations will try {{smallFS}} first and then 
{{bigFS}} on a miss (and also store a copy in {{smallFS}} on a miss). If we 
want the strong guarantee of _using HDFS as the storage platform_, we can also 
stage the data into {{smallFS}} first and then serve it to the application, 
like the Linux page cache.
{code}
public class CacheFileSystem extends FileSystem {
  private final ChRootedFileSystem smallFS;
  private final ChRootedFileSystem bigFS;
  ...
  @Override
  public FileStatus getFileStatus(Path f) throws IOException {
try {
  return smallFS.getFileStatus(f);
} catch (FileNotFoundException e) {
  return bigFS.getFileStatus(f);
}
  }

  @Override
  public FSDataOutputStream create(Path f,
  FsPermission permission,
  boolean overwrite,
  int bufferSize,
  short replication,
  long blockSize,
  Progressable progress) throws IOException {
return smallFS.create(f, permission, overwrite,
bufferSize, replication, blockSize, progress);
  }
  ...
}
{code}
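
Under the same assumptions, the read path on a miss might look like the 
following (again a PoC-level sketch; the copy-back into {{smallFS}} is elided):
{code}
  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    try {
      // hit: serve from the HDFS cache tier
      return smallFS.open(f, bufferSize);
    } catch (FileNotFoundException e) {
      // miss: fall through to the backing store; a fuller version would also
      // schedule an asynchronous copy of f into smallFS here
      return bigFS.open(f, bufferSize);
    }
  }
{code}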

Of course if you want to present a {{DistributedFileSystem}} to applications, 
the logic will be more complex (e.g. more wrapping on the output from 
{{bigFS}}). But I think it's still simpler than breaking into NN and DN 
internals. This is basically how Alluxio / Tachyon handles the caching logic.

Thoughts?

> Allow HDFS block replicas to be provided by an external storage system
> --
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Chris Douglas
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate/and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.





[jira] [Updated] (HDFS-10320) Rack failures may result in NN terminate

2016-04-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10320:
-
Attachment: HDFS-10320.01.patch

Patch 1 logs an error when {{NetworkTopology#chooseRandom}} cannot choose any 
nodes; BPPD then tries to fall back on the local rack.

I haven't come up with a decent unit test, since we'd want to test BPPD's 
{{chooseRemoteRack}} to cover our change, but it's hard to reach said race 
condition in a test. {{NT#countNumOfAvailableNodes}} will return 0 if we fail 
the racks in the test beforehand, and mocking it to return non-zero will make 
the loop in {{BPPD#chooseRandom}} never exit... Any comments are highly 
appreciated. Thanks!
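
The shape of the change is roughly the following (a sketch only; the argument 
lists are abbreviated and this is not the attached patch):
{code}
// in BlockPlacementPolicyDefault#chooseRemoteRack
try {
  // try to pick a node outside of the writer's rack
  chooseRandom(numOfReplicas, "~" + localMachine.getNetworkLocation(), ...);
} catch (NetworkTopology.InvalidTopologyException e) {
  // only one rack is reachable: log it and fall back to the local rack,
  // since HDFS has nowhere else to place the replica
  LOG.error("Failed to choose remote rack, falling back to local rack", e);
  chooseRandom(numOfReplicas, localMachine.getNetworkLocation(), ...);
}
{code}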

> Rack failures may result in NN terminate
> 
>
> Key: HDFS-10320
> URL: https://issues.apache.org/jira/browse/HDFS-10320
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-10320.01.patch
>
>
> If there are rack failures which end up leaving only 1 rack available, 
> {{BlockPlacementPolicyDefault#chooseRandom}} may get an 
> {{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
> which then propagates all the way out to {{BlockManager}}'s 
> {{ReplicationMonitor}} thread and terminates the NN.
> Log:
> {noformat}
> 2016-02-24 09:22:01,514  WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-02-24 09:22:01,958  ERROR 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception. 
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to 
> find datanode (scope="" excludedScope="/rack_a5").
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3746)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3711)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1400)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1306)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3682)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3634)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Commented] (HDFS-10320) Rack failures may result in NN terminate

2016-04-20 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250924#comment-15250924
 ] 

Xiao Chen commented on HDFS-10320:
--

The failure is due to a race, since in {{BPPD#chooseRandom}} we calculate the 
available nodes before the while loop.
This bug only happens under the following conditions:
# {{numOfAvailableNodes}} is calculated before the while loop
# Racks fail, and the only nodes left are on the same rack as the current 
replica. The occurrence we saw is a cluster with only 2 racks, where 1 rack 
failed.
# {{BPPD#chooseDataNode}} -> {{NetworkTopology#chooseRandom}}: the current rack 
is in {{excludedScope}}, so no datanodes can be chosen.

IMHO, the fix would be to fall back to the current rack and log a warning 
message - HDFS has no option but to replicate on the only rack alive. 
Administrators are expected to recover the failed rack(s).

> Rack failures may result in NN terminate
> 
>
> Key: HDFS-10320
> URL: https://issues.apache.org/jira/browse/HDFS-10320
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>
> If there are rack failures which end up leaving only 1 rack available, 
> {{BlockPlacementPolicyDefault#chooseRandom}} may get an 
> {{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
> which then propagates all the way out to {{BlockManager}}'s 
> {{ReplicationMonitor}} thread and terminates the NN.
> Log:
> {noformat}
> 2016-02-24 09:22:01,514  WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-02-24 09:22:01,958  ERROR 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception. 
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to 
> find datanode (scope="" excludedScope="/rack_a5").
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3746)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3711)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1400)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1306)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3682)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3634)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Commented] (HDFS-10319) Balancer should not try to pair storages with different types

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250910#comment-15250910
 ] 

Hadoop QA commented on HDFS-10319:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 0s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 6s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 143m 55s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.TestHFlush |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799810/h10319_20160420.patch 
|
| JIRA Issue | HDFS-10319 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  

[jira] [Updated] (HDFS-10175) add per-operation stats to FileSystem.Statistics

2016-04-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-10175:

Attachment: HDFS-10175.006.patch

I posted patch 006 as an example of what I was thinking about.  The idea would 
be that some FS subclasses like DistributedFileSystem would override 
{{FileSystem#getStorageStatistics}} to return a {{StorageStatistics}} object 
with more details.  In the case of {{DistributedFileSystem}} (HDFS) these would 
be mostly about the number of various RPCs that were done.  We would simply use 
{{volatile longs}} to store this information.
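
As a rough illustration of the volatile-long approach (a hypothetical class, 
not patch 006; as noted above, occasional lost increments are acceptable for 
best-effort statistics):
{code}
public class DfsOpCounters {
  // one volatile long per tracked operation; plain increments may race,
  // which is tolerated here in exchange for zero locking overhead
  private volatile long mkdirOps;
  private volatile long renameOps;
  private volatile long deleteOps;

  public void incrMkdirOps()  { mkdirOps++; }
  public void incrRenameOps() { renameOps++; }
  public void incrDeleteOps() { deleteOps++; }

  public long getMkdirOps()  { return mkdirOps; }
  public long getRenameOps() { return renameOps; }
  public long getDeleteOps() { return deleteOps; }
}
{code}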

> add per-operation stats to FileSystem.Statistics
> 
>
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Ram Venkatesh
>Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing, for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.





[jira] [Updated] (HDFS-10320) Rack failures may result in NN terminate

2016-04-20 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10320:
-
Description: 
If there are rack failures which end up leaving only 1 rack available, 
{{BlockPlacementPolicyDefault#chooseRandom}} may get an 
{{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
which then propagates all the way out to {{BlockManager}}'s 
{{ReplicationMonitor}} thread and terminates the NN.

Log:
{noformat}
2016-02-24 09:22:01,514  WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more 
information, please enable DEBUG log level on 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy

2016-02-24 09:22:01,958  ERROR 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor 
thread received Runtime exception. 
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to find 
datanode (scope="" excludedScope="/rack_a5").
at 
org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
at 
org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3746)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3711)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1400)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1306)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3682)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3634)
at java.lang.Thread.run(Thread.java:745)
{noformat}
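
A hedged sketch of one possible mitigation (the attached patch may well differ): catch the unchecked topology exception at the placement-policy boundary so a transient single-rack topology does not kill the ReplicationMonitor thread. {{doChooseRandom}} below is an illustrative helper, not an existing method.

{code}
// Hedged sketch, not HDFS-10320.01.patch: treat InvalidTopologyException
// like "no node found" so the caller retries later, instead of the
// exception propagating up and terminating the NameNode.
private DatanodeDescriptor chooseRandomSafely(String scope,
    String excludedScope) {
  try {
    return doChooseRandom(scope, excludedScope);  // illustrative helper
  } catch (NetworkTopology.InvalidTopologyException e) {
    LOG.warn("Topology is temporarily degenerate (" + e.getMessage()
        + "); skipping this placement attempt.");
    return null;  // callers already handle "no target found"
  }
}
{code}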

> Rack failures may result in NN terminate
> 
>
> Key: HDFS-10320
> URL: https://issues.apache.org/jira/browse/HDFS-10320
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>
> If rack failures end up leaving only one rack available, 
> {{BlockPlacementPolicyDefault#chooseRandom}} may get an 
> {{InvalidTopologyException}} when calling {{NetworkTopology#chooseRandom}}, 
> which then propagates all the way out to {{BlockManager}}'s 
> {{ReplicationMonitor}} thread and terminates the NN.
> Log:
> {noformat}
> 2016-02-24 09:22:01,514  WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-02-24 09:22:01,958  ERROR 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception. 
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Failed to 
> find datanode (scope="" excludedScope="/rack_a5").
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:729)
>   at 
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:694)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:635)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>   at 
> 

[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250903#comment-15250903
 ] 

Xiaoyu Yao commented on HDFS-10207:
---

There are 4 more checkstyle issues (in addition to the 3 imports) from the 
previous Jenkins run. Please fix those. Thanks!
{code}
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:284:
  private String IPC_CLIENT_RPC_BACKOFF_ENABLE;:18: Name 
'IPC_CLIENT_RPC_BACKOFF_ENABLE' must match pattern '^[a-z][a-zA-Z0-9]*$'.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:287:
  final TreeSet RECONFIGURABLE_PROPERTIES = Sets.newTreeSet(Lists:25: 
Name 'RECONFIGURABLE_PROPERTIES' must match pattern '^[a-z][a-zA-Z0-9]*$'.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:287:
  final TreeSet RECONFIGURABLE_PROPERTIES = Sets.newTreeSet(Lists:25: 
Variable 'RECONFIGURABLE_PROPERTIES' must be private and have accessor methods.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:2041:

.setHeartbeatRecheckInterval(DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_DEFAULT);: 
Line is longer than 80 characters (found 90).
{code}
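
For illustration, a hedged sketch of changes that would satisfy those rules 
(based only on the report above, not on the actual patch; the set entries and 
the {{datanodeManager}} receiver are assumptions):

{code}
// Non-static fields must be lowerCamelCase per the naming rule:
private String ipcClientRpcBackoffEnable;

// A private static final set keeps the CONSTANT_CASE name legal and
// avoids the accessor-method warning (entries are illustrative):
private static final TreeSet<String> RECONFIGURABLE_PROPERTIES =
    Sets.newTreeSet(Lists.newArrayList("dfs.heartbeat.interval"));

// The over-long line can simply be wrapped to stay under 80 characters:
datanodeManager.setHeartbeatRecheckInterval(
    DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_DEFAULT);
{code}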

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch, 
> HDFS-10207-HDFS-9000.007.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.
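
For context, a hedged sketch of how such a toggle could take effect at runtime 
(illustrative names; {{setClientBackoffEnabled}} is an assumed setter, and the 
actual patch wires this through the NameNode's reconfiguration framework):

{code}
// Hedged sketch, not the HDFS-10207 patch: re-read the backoff key and
// flip the live RPC server's behavior when an admin triggers a reconfig.
void applyBackoffReconfiguration(Configuration newConf, int clientRpcPort) {
  String key = "ipc." + clientRpcPort + ".backoff.enable";
  boolean enable = newConf.getBoolean(key, false);
  rpcServer.getClientRpcServer().setClientBackoffEnabled(enable);  // assumed
  LOG.info("Reconfigured " + key + " to " + enable);
}
{code}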



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10320) Rack failures may result in NN terminate

2016-04-20 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-10320:


 Summary: Rack failures may result in NN terminate
 Key: HDFS-10320
 URL: https://issues.apache.org/jira/browse/HDFS-10320
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Xiao Chen
Assignee: Xiao Chen






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250896#comment-15250896
 ] 

Hadoop QA commented on HDFS-10309:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 38s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 17s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 168m 27s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799804/HDFS-10309.02.patch |
| JIRA Issue | HDFS-10309 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 49a4c63a98be 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 

[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250880#comment-15250880
 ] 

Hadoop QA commented on HDFS-10207:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 47s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 32s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 7s 
{color} | {color:red} root: patch generated 7 new + 417 unchanged - 0 fixed = 
424 total (was 417) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 37s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 31s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 54s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 48s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 184m 32s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||

[jira] [Commented] (HDFS-10317) dfs.domain.socket.path is not set in TestShortCircuitLocalRead.testReadWithRemoteBlockReader

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250868#comment-15250868
 ] 

Xiaobing Zhou commented on HDFS-10317:
--

[~libo-intel], can you comment on how to reproduce it? Thank you.

> dfs.domain.socket.path is not set in 
> TestShortCircuitLocalRead.testReadWithRemoteBlockReader
> 
>
> Key: HDFS-10317
> URL: https://issues.apache.org/jira/browse/HDFS-10317
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Li Bo
>
> org.apache.hadoop.HadoopIllegalArgumentException: The short-circuit local 
> reads feature is enabled but dfs.domain.socket.path is not set.
>   at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.(DomainSocketFactory.java:115)
>   at org.apache.hadoop.hdfs.ClientContext.(ClientContext.java:132)
>   at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:157)
>   at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:358)
>   at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:275)
>   at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:266)
>   at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:258)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2466)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2512)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1632)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:844)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitReadWithRemoteBlockReader(TestShortCircuitLocalRead.java:608)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.testReadWithRemoteBlockReader(TestShortCircuitLocalRead.java:590)
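
The usual remedy in other short-circuit tests (a hedged sketch, not the 
committed fix) is to point {{dfs.domain.socket.path}} at a temporary socket 
directory before the MiniDFSCluster starts:

{code}
// Hedged sketch: give the short-circuit client a socket path in test setup.
TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
Configuration conf = new Configuration();
conf.setBoolean("dfs.client.read.shortcircuit", true);
conf.set("dfs.domain.socket.path",
    new File(sockDir.getDir(), "testReadWithRemoteBlockReader._PORT.sock")
        .getAbsolutePath());
{code}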



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor

2016-04-20 Thread Nicolas Fraison (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250841#comment-15250841
 ] 

Nicolas Fraison commented on HDFS-10220:


[~vinayrpet], there is one checkstyle issue remaining, but it is on a line not 
changed by the patch, and the review comments have been addressed.
Let me know if the patch can be merged.

> Namenode failover due to too long locking in LeaseManager.Monitor
> 
>
> Key: HDFS-10220
> URL: https://issues.apache.org/jira/browse/HDFS-10220
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Nicolas Fraison
>Assignee: Nicolas Fraison
>Priority: Minor
> Attachments: HADOOP-10220.001.patch, HADOOP-10220.002.patch, 
> HADOOP-10220.003.patch, HADOOP-10220.004.patch, threaddump_zkfc.txt
>
>
> I have faced a namenode failover due to an unresponsive namenode detected by 
> the zkfc, with lots of WARN messages (5 million) like this one:
> _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All 
> existing blocks are COMPLETE, lease removed, file closed._
> In the thread dump taken by the zkfc there are lots of threads blocked due to 
> a lock.
> Looking at the code, a lock is taken by LeaseManager.Monitor when some leases 
> must be released. Due to the really big number of leases to be released, the 
> namenode took too long to release them, blocking all other tasks and making 
> the zkfc think that the namenode was not available/stuck.
> The idea of this patch is to limit the number of leases released each time we 
> check, so the lock won't be held for too long a period.
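
In code terms, a hedged sketch of that cap (illustrative names and limit, not 
the attached patch):

{code}
// Hedged sketch: release at most a fixed number of expired leases per
// Monitor iteration so the namesystem write lock is never held for an
// unbounded stretch of time.
private static final int MAX_LEASES_PER_CHECK = 1000;  // assumed cap

synchronized boolean checkLeases() {
  int released = 0;
  while (released < MAX_LEASES_PER_CHECK && !sortedLeases.isEmpty()) {
    Lease oldest = sortedLeases.first();
    if (!oldest.expiredHardLimit()) {
      break;  // nothing more to release yet
    }
    releaseLease(oldest);  // illustrative: close files / recover the lease
    released++;
  }
  // Returning true tells the caller there may be more expired leases, so
  // it can yield the lock and come back on the next check.
  return !sortedLeases.isEmpty();
}
{code}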



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9894) Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives

2016-04-20 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250821#comment-15250821
 ] 

Jing Zhao commented on HDFS-9894:
-

+1 pending Jenkins.

> Add unsetStoragePolicy API to FileContext/AbstractFileSystem and derivatives
> 
>
> Key: HDFS-9894
> URL: https://issues.apache.org/jira/browse/HDFS-9894
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>  Labels: 2.8.0
> Attachments: HDFS-9894.000.patch
>
>
> This is to augment FileContext/AbstractFileSystem and derivatives with the 
> newly added API unsetStoragePolicy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250820#comment-15250820
 ] 

Xiaobing Zhou commented on HDFS-10207:
--

Patch v007 removed the unused imports. Thanks [~xyao] for review.

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch, 
> HDFS-10207-HDFS-9000.007.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10207:
-
Attachment: HDFS-10207-HDFS-9000.007.patch

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch, 
> HDFS-10207-HDFS-9000.007.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart to protect namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics

2016-04-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250807#comment-15250807
 ] 

Colin Patrick McCabe commented on HDFS-10175:
-

I thought about this a little bit more, and I don't think that 
{{FileSystem#Statistics#StatisticsData}} is the best place to add these new 
statistics.  There are a few reasons.

Firstly, the statistics that we're interested in are inherently 
filesystem-specific.  For HDFS, we're interested in the number of RPCs to the 
NameNode: calls like primitiveCreate, getBytesWithFutureGS, or concat.  For 
something like s3a, we're interested in how many PUT and GET requests we've 
made to Amazon S3.  S3 doesn't even support genstamps or the concat operation.  
Local filesystems have their own operations which are important.

Secondly, the thread-local-data mechanism is not really appropriate for most 
operations.  Thread-local data is a big performance win when reading or 
writing bytes of data from or to a stream, since most such operations don't 
involve making an RPC.  We have big client-side buffers, which means that most 
reads and writes can return immediately.  In contrast, operations like mkdir, 
rename, delete, etc. always end up making at least one RPC, since these 
operations cannot be buffered on the client.  In that case, the CPU overhead of 
doing an atomic increment is negligible, but the overhead of storing all that 
thread-local data is significant.

I think what we should do is add an API to the FileSystem and FileContext base 
classes, which different types of FS can implement as appropriate.
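
As a hedged illustration of that direction (not a committed design; the class 
and method names here are assumptions), a filesystem could keep its own 
per-operation counters with atomic adders and expose them through a generic 
accessor:

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hedged sketch of filesystem-specific per-operation counters.
class PerOpStatistics {
  private final Map<String, LongAdder> ops = new ConcurrentHashMap<>();

  // Cheap enough for RPC-bound calls like mkdirs/rename/delete, where one
  // atomic increment is dwarfed by the NameNode round trip.
  void increment(String op) {
    ops.computeIfAbsent(op, k -> new LongAdder()).increment();
  }

  long get(String op) {
    LongAdder a = ops.get(op);
    return a == null ? 0L : a.sum();
  }
}
{code}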

> add per-operation stats to FileSystem.Statistics
> 
>
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Ram Venkatesh
>Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250792#comment-15250792
 ] 

Hudson commented on HDFS-10312:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9637 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9637/])
HDFS-10312. Large block reports may fail to decode at NameNode due to 64 
(cnauroth: rev 63ac2db59af2b50e74dc892cae1dbc4d2e061423)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestLargeBlockReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java


> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be raised using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.
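
A hedged sketch of the override technique (the wrapper method is illustrative; 
the {{setSizeLimit}} call is the point):

{code}
import com.google.protobuf.CodedInputStream;

// Hedged sketch: decode a large serialized block list by raising protobuf's
// default 64 MB size limit to the configured ipc.maximum.data.length value.
static CodedInputStream newBlockReportDecoder(byte[] data, int maxDataLength) {
  CodedInputStream cis = CodedInputStream.newInstance(data);
  // Without this, messages over 64 MB fail with InvalidProtocolBufferException.
  cis.setSizeLimit(maxDataLength);
  return cis;
}
{code}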



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10311) libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250783#comment-15250783
 ] 

Hadoop QA commented on HDFS-10311:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
55s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 49s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 51s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 7s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 55s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 24s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799808/HDFS-10311.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-10311 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 331cded37231 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d8653c8 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_77 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| JDK v1.7.0_95  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15224/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15224/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket
> -
>
> Key: HDFS-10311
> URL: https://issues.apache.org/jira/browse/HDFS-10311
> 

[jira] [Updated] (HDFS-10311) libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket

2016-04-20 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10311:
---
Attachment: HDFS-10311.HDFS-8707.001.patch

New patch, cleaned some stuff up.

> libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket
> -
>
> Key: HDFS-10311
> URL: https://issues.apache.org/jira/browse/HDFS-10311
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10311.HDFS-8707.000.patch, 
> HDFS-10311.HDFS-8707.001.patch
>
>
> DataNodeConnectionImpl calls reset on the unique_ptr that references the 
> underlying asio::tcp::socket.  If this happens after the continuation 
> pipeline checks the cancel state but before asio uses the socket, it will 
> segfault, because unique_ptr::reset will explicitly change its value to 
> nullptr.
> Cancel should only call shutdown() and close() on the socket but keep the 
> instance alive.  The socket can probably also be turned into a member of 
> DataNodeConnectionImpl to get rid of the unique pointer and simplify things 
> a bit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250764#comment-15250764
 ] 

Xiaoyu Yao commented on HDFS-10207:
---

[~xiaobingo], thanks for the update. Patch v006 looks good to me. 
+1 after the following unused imports are removed from NameNode.java:

{code}
import org.apache.hadoop.ipc.RPC;
import java.util.Collections;
import java.util.Set;
{code}

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250763#comment-15250763
 ] 

Xiaobing Zhou commented on HDFS-10224:
--

Patch v002 removed all file system APIs, e.g. AsyncFileSystem.

> Implement an asynchronous DistributedFileSystem
> ---
>
> Key: HDFS-10224
> URL: https://issues.apache.org/jira/browse/HDFS-10224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10224-HDFS-9924.000.patch, 
> HDFS-10224-HDFS-9924.001.patch, HDFS-10224-HDFS-9924.002.patch, 
> HDFS-10224-and-HADOOP-12909.000.patch
>
>
> This proposes to implement an asynchronous DistributedFileSystem based on the 
> AsyncFileSystem APIs in HADOOP-12910. In addition, rename is implemented as 
> well.
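
As a purely illustrative, hedged sketch of the caller-visible shape (a real 
implementation would use non-blocking RPC rather than a thread pool, and these 
names are not the HADOOP-12910 API):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: an asynchronous rename that returns a Future instead of
// blocking the caller until the NameNode responds.
class AsyncRenameSketch {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);

  Future<Boolean> rename(FileSystem fs, Path src, Path dst) {
    return pool.submit(() -> fs.rename(src, dst));
  }
}
{code}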



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10224) Implement an asynchronous DistributedFileSystem

2016-04-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10224:
-
Attachment: HDFS-10224-HDFS-9924.002.patch

> Implement an asynchronous DistributedFileSystem
> ---
>
> Key: HDFS-10224
> URL: https://issues.apache.org/jira/browse/HDFS-10224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10224-HDFS-9924.000.patch, 
> HDFS-10224-HDFS-9924.001.patch, HDFS-10224-HDFS-9924.002.patch, 
> HDFS-10224-and-HADOOP-12909.000.patch
>
>
> This proposes to implement an asynchronous DistributedFileSystem based on the 
> AsyncFileSystem APIs in HADOOP-12910. In addition, rename is implemented as 
> well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-20 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-10264:
---
Target Version/s: 2.6.5

+1 from me too.
Could we please commit this to the 2.6 branch as well?

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Fix For: 2.7.3
>
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are missing in 
> {{FSImageFormatProtobuf.Saver}}; they mark the start and end of fsimage 
> saving. It would be good to have them logged for protobuf images as well.
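
For reference, a hedged sketch of the two messages in question, patterned 
after {{FSImageFormat.Saver}} (exact wording and variable names are 
assumptions; {{Time.monotonicNow}} is Hadoop's monotonic clock):

{code}
// Hedged sketch of the requested start/end log lines around image saving.
long startTime = Time.monotonicNow();
LOG.info("Saving image file " + newFile + " using " + compression);
// ... serialize the fsimage sections ...
LOG.info("Image file " + newFile + " of size " + newFile.length()
    + " bytes saved in " + ((Time.monotonicNow() - startTime) / 1000)
    + " seconds.");
{code}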



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250714#comment-15250714
 ] 

Xiaobing Zhou commented on HDFS-10207:
--

patch v006 removed some unused imports, e.g.
{code}
- import static 
org.apache.hadoop.hdfs.DFSConfigKeys.DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY;
- import static 
org.apache.hadoop.hdfs.DFSConfigKeys.DFS_NAMENODE_LIFELINE_RPC_ADDRESS_KEY;
{code}

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10207:
-
Attachment: HDFS-10207-HDFS-9000.006.patch

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch, HDFS-10207-HDFS-9000.006.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-20 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-10312:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thank you for the reviews, everyone.  The test failures were unrelated.  I 
corrected the whitespace warning.  I have committed this to trunk, branch-2, 
and branch-2.8.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be raised using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-04-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250640#comment-15250640
 ] 

Ravi Prakash commented on HDFS-6489:


From Bogdan's code (thanks for that :-) ) I see it's not #1 in my comment 
above, but #2 that is causing the problem. I'll check to see if there is a 
workaround.

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space 
> on each block write operation. This is correct in most scenarios (creating a 
> new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M), and I try to 
> append to it very frequently, appending only 10 bytes each time;
> then on each append, DFS used will be increased by the length of the 
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+), and assume the block size is 32M (half of the 
> default value); then the DFS used will be increased by 1000*32M = 32G on each 
> append to the files, but actually I only write 10K bytes. This will cause the 
> datanode to report insufficient disk space on data writes.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-04-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250626#comment-15250626
 ] 

Ravi Prakash commented on HDFS-6489:


Unfortunately I don't think your approach will work, Weiwei! The du thread 
takes a really long time: on a single datanode there may be millions of files, 
and thus the du takes a while to complete.
I would concur with Andrew's opinion *unless* one of the following is happening:
1. Appending to the same file multiple times without closing is causing DFS 
usage to jump by the block size every time. It seems, [~andrew.wang], that 
users are reporting such behavior. It may be worth a look.
2. We are rejecting writes on the datanodes based on (possibly outdated) 
information from the du thread. If we are going to wait for the du thread to 
update available space before writing, we may be rejecting for a long time.
Please let me know if I somehow misunderstand the issue / symptoms.
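
For reference, a hedged sketch of the accounting idea under discussion 
(illustrative names, not the DataNode code): on an append, charge only the 
bytes actually added rather than the full block length again.

{code}
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch: track DFS-used with deltas on append instead of
// re-adding the whole block length.
class DfsUsedSketch {
  private final AtomicLong dfsUsed = new AtomicLong();

  long onAppendFinalized(long lengthBefore, long lengthAfter) {
    long delta = lengthAfter - lengthBefore;  // e.g. 10 bytes, not 60 MB
    dfsUsed.addAndGet(delta);
    return delta;
  }
}
{code}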

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space 
> on each block write operation. This is correct in most scenarios (creating a 
> new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M), and I try to 
> append to it very frequently, appending only 10 bytes each time;
> then on each append, DFS used will be increased by the length of the 
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+), and assume the block size is 32M (half of the 
> default value); then the DFS used will be increased by 1000*32M = 32G on each 
> append to the files, but actually I only write 10K bytes. This will cause the 
> datanode to report insufficient disk space on data writes.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9670) DistCp throws NPE when source is root

2016-04-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250617#comment-15250617
 ] 

Yongjun Zhang commented on HDFS-9670:
-

Thanks for the new rev, [~jzhuge].

LGTM, +1, and I will commit soon. Sorry for the delayed review.



> DistCp throws NPE when source is root
> -
>
> Key: HDFS-9670
> URL: https://issues.apache.org/jira/browse/HDFS-9670
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: John Zhuge
>  Labels: supportability
> Fix For: 2.8.0
>
> Attachments: HDFS-9670.001.patch, HDFS-9670.002.patch
>
>
> Symptom:
> {quote}
> [root@vb0724 ~]# hadoop distcp hdfs://X:8020/ hdfs://Y:8020/
> 16/01/20 11:33:33 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', 
> copyStrategy='uniformsize', sourceFileListing=null, 
> sourcePaths=[hdfs://X:8020/], targetPath=hdfs://Y:8020/, 
> targetPathExists=true, preserveRawXattrs=false, filtersFile='null'}
> 16/01/20 11:33:33 INFO client.RMProxy: Connecting to ResourceManager at Z:8032
> 16/01/20 11:33:33 ERROR tools.DistCp: Exception encountered 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.tools.util.DistCpUtils.getRelativePath(DistCpUtils.java:144)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.writeToFileListing(SimpleCopyListing.java:598)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.writeToFileListingRoot(SimpleCopyListing.java:583)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:313)
>   at 
> org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:174)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>   at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:365)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:171)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:122)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:429)
> {quote}
> Relevant code:
> {code}
>   private Path computeSourceRootPath(FileStatus sourceStatus,
>  DistCpOptions options) throws 
> IOException {
> Path target = options.getTargetPath();
> FileSystem targetFS = target.getFileSystem(getConf());
> final boolean targetPathExists = options.getTargetPathExists();
> boolean solitaryFile = options.getSourcePaths().size() == 1
> && 
> !sourceStatus.isDirectory();
> if (solitaryFile) {
>   if (targetFS.isFile(target) || !targetPathExists) {
> return sourceStatus.getPath();
>   } else {
> return sourceStatus.getPath().getParent();
>   }
> } else {
>   boolean specialHandling = (options.getSourcePaths().size() == 1 && 
> !targetPathExists) ||
>   options.shouldSyncFolder() || options.shouldOverwrite();
>   return specialHandling && sourceStatus.isDirectory() ? 
> sourceStatus.getPath() :
>   sourceStatus.getPath().getParent();
> }
>   }
> {code}
> We can see that it could return NULL at the end when doing 
> {{sourceStatus.getPath().getParent()}}
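
A hedged sketch of a null-safe variant (not necessarily the committed fix): 
fall back to the path itself when it has no parent, i.e. when the source is 
the root.

{code}
// Hedged sketch: Path.getParent() returns null for the root directory, so
// guard before returning it as the source root path.
private static Path parentOrSelf(Path p) {
  Path parent = p.getParent();
  return parent == null ? p : parent;  // root "/" has no parent
}

// ...then in computeSourceRootPath:
//   return specialHandling && sourceStatus.isDirectory()
//       ? sourceStatus.getPath() : parentOrSelf(sourceStatus.getPath());
{code}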



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10319) Balancer should not try to pair storages with different types

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-10319:
---
Status: Patch Available  (was: Open)

> Balancer should not try to pair storages with different types
> -
>
> Key: HDFS-10319
> URL: https://issues.apache.org/jira/browse/HDFS-10319
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h10319_20160420.patch
>
>
> This is a performance bug: Balancer may pair a source datanode and a target 
> datanode with different storage types. Fortunately, it will fail to schedule 
> any blocks in such a pair, since it will later find out that the storage 
> types do not match.
> The bug won't lead to incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10319) Balancer should not try to pair storages with different types

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-10319:
---
Attachment: h10319_20160420.patch

h10319_20160420.patch: matches storage types.
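
In spirit (a hedged sketch with illustrative names; {{Source}} and 
{{StorageGroup}} stand in for the dispatcher's bookkeeping types), the pairing 
step just needs a storage-type check before a source and a target are matched:

{code}
// Hedged sketch: never pair a source and target whose storage types
// differ, so the dispatcher stops scheduling moves that would only be
// rejected later on.
boolean canPair(Source source, StorageGroup target) {
  return source.getStorageType() == target.getStorageType()
      && !source.getDatanodeInfo().equals(target.getDatanodeInfo());
}
{code}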

> Balancer should not try to pair storages with different types
> -
>
> Key: HDFS-10319
> URL: https://issues.apache.org/jira/browse/HDFS-10319
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h10319_20160420.patch
>
>
> This is a performance bug: Balancer may pair a source datanode and a target 
> datanode with different storage types. Fortunately, it will fail to schedule 
> any blocks in such a pair, since it will later find out that the storage 
> types do not match.
> The bug won't lead to incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6489) DFS Used space is not correct computed on frequent append operations

2016-04-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250592#comment-15250592
 ] 

Ravi Prakash commented on HDFS-6489:


Thanks for the patch, [~cheersyang]! Could you please let me know which branch 
I should apply it against, and how? I am seeing conflicts. Usually we start 
with patches against trunk, and then after it gets committed to trunk, 
backport it to branch-2.

> DFS Used space is not correct computed on frequent append operations
> 
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0, 2.7.1, 2.7.2
>Reporter: stanley shi
>Assignee: Weiwei Yang
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, 
> HDFS-6489.003.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space 
> on each block write operation. This is correct in most scenarios (creating a 
> new file), but sometimes it behaves incorrectly (appending small data to a 
> large block).
> For example, I have a file with only one block (say, 60M), and I try to 
> append to it very frequently, appending only 10 bytes each time;
> then on each append, DFS used will be increased by the length of the 
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a 
> large number of files (1000+), and assume the block size is 32M (half of the 
> default value); then the DFS used will be increased by 1000*32M = 32G on each 
> append to the files, but actually I only write 10K bytes. This will cause the 
> datanode to report insufficient disk space on data writes.
> {quote}2014-06-04 15:27:34,719 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock  
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received 
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: 
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, 
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> FilesystemSize  Used Avail Use% Mounted on
> /dev/sda3  16G  2.9G   13G  20% /
> tmpfs 1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1  97M   32M   61M  35% /boot
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-10309:
---
Priority: Minor  (was: Major)
Hadoop Flags: Reviewed

+1 the new patch looks good.  Thanks.

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10309.01.patch, HDFS-10309.02.patch
>
>
> While running the HDFS Balancer I get the error given below when 
> {{dfs.blocksize}} is defined with a suffix {{k(kilo), m(mega), g(giga)}} in 
> {{hdfs-site.xml}}. In my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a numeric 
> value for {{dfs.blocksize}}, or to change it in your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}
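
For context, a hedged sketch of the fix direction mentioned later in this digest 
({{Configuration#getLongBytes}} understands the k/m/g suffixes that {{getLong}} 
rejects); this is an illustration, not the committed patch:

{code}
import org.apache.hadoop.conf.Configuration;

public class BlockSizeSuffixDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("dfs.blocksize", "128m");

    // getLong(..) would throw NumberFormatException: For input string: "128m".
    // getLongBytes(..) parses the size suffix into bytes.
    long bytes = conf.getLongBytes("dfs.blocksize", 134217728L);
    System.out.println(bytes);  // prints 134217728
  }
}
{code}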



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

2016-04-20 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250582#comment-15250582
 ] 

Jing Zhao commented on HDFS-10313:
--

bq. the old behavior is that we fall back to a regular distcp if sync failed; my 
suggested change means no fallback would happen. I believe it's safer, do you 
agree?

Yes, agreed. Let's return an error message here.

> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is passed, 
> we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing

2016-04-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250584#comment-15250584
 ] 

Hudson commented on HDFS-10308:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9636 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9636/])
HDFS-10308. TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing (cmccabe: 
rev ad36fa6f42870c517526618a30204b443bfc6b5a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeRetryCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java


> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10319) Balancer should not try to pair storages with different types

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-10319:
--

 Summary: Balancer should not try to pair storages with different 
types
 Key: HDFS-10319
 URL: https://issues.apache.org/jira/browse/HDFS-10319
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor


This is a performance bug – Balancer may pair a source datanode and a target 
datanode with different storage types. Fortunately, it will fail to schedule any 
blocks in such a pair, since it will later find out that the storage types do 
not match.

The bug won't lead to incorrect results.
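
A hypothetical sketch of the intended guard (names assumed, not the actual 
Balancer code):

{code}
import org.apache.hadoop.fs.StorageType;

class PairingGuard {
  // Only pair a source and a target storage of the same type, e.g. DISK with
  // DISK; never DISK with ARCHIVE, since such a move would only fail later.
  static boolean canPair(StorageType source, StorageType target) {
    return source == target;
  }
}
{code}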



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10311) libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket

2016-04-20 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10311:
---
Status: Patch Available  (was: Open)

> libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket
> -
>
> Key: HDFS-10311
> URL: https://issues.apache.org/jira/browse/HDFS-10311
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10311.HDFS-8707.000.patch
>
>
> DataNodeConnectionImpl calls reset on the unique_ptr that references the 
> underlying asio::tcp::socket.  If this happens after the continuation 
> pipeline checks the cancel state but before asio uses the socket it will 
> segfault because unique_ptr::reset will explicitly change its value to 
> nullptr.
> Cancel should only call shutdown() and close() on the socket but keep the 
> instance of it alive.  The socket can probably also be turned into a member 
> of DataNodeConnectionImpl to get rid of the unique pointer and simplify 
> things a bit.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10311) libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket

2016-04-20 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10311:
---
Attachment: HDFS-10311.HDFS-8707.000.patch

Fix for this bug:

- Add lock guards to DataNodeConnectionImpl methods, because the underlying asio 
socket isn't thread safe.
- Refactor the deleter used by DataNodeConnectionImpl; cancel can now call the 
same code for disconnecting, but it won't do the delete.
- The socket deleter checks if the socket is open before running SafeDisconnect, 
to avoid false-positive errors.

Tested by running 1K threads, each doing reads in a busy loop, then waiting 10 
seconds to make sure the FS is connected and the files are opened properly 
before calling hdfsCancel on all file handles. They all stop with no segfaults 
and return -1 as expected.

> libhdfs++: DatanodeConnection::Cancel should not delete the underlying socket
> -
>
> Key: HDFS-10311
> URL: https://issues.apache.org/jira/browse/HDFS-10311
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10311.HDFS-8707.000.patch
>
>
> DataNodeConnectionImpl calls reset on the unique_ptr that references the 
> underlying asio::tcp::socket.  If this happens after the continuation 
> pipeline checks the cancel state but before asio uses the socket it will 
> segfault because unique_ptr::reset will explicitly change its value to 
> nullptr.
> Cancel should only call shutdown() and close() on the socket but keep the 
> instance of it alive.  The socket can probably also be turned into a member 
> of DataNodeConnectionImpl to get rid of the unique pointer and simplify 
> things a bit.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Amit Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Anand updated HDFS-10309:
--
Attachment: HDFS-10309.02.patch

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
> Fix For: 2.8.0
>
> Attachments: HDFS-10309.01.patch, HDFS-10309.02.patch
>
>
> While running HDFS Balancer I get the error given below when {{dfs.blocksize}} 
> is defined with a suffix {{k(kilo), m(mega), g(giga)}} in {{hdfs-site.xml}}. In 
> my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a numeric 
> value for {{dfs.blocksize}}, or to change it in your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

2016-04-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250521#comment-15250521
 ] 

Yongjun Zhang edited comment on HDFS-10313 at 4/20/16 7:03 PM:
---

In addition, there is a method 
{{getSourceSnapshotPath(Path sourceDir, String snapshotName)}} you can use to 
replace the two lines of code that compute the snapshot path.



was (Author: yzhangal):
BTW, there is a method 
{{getSourceSnapshotPath(Path sourceDir, String snapshotName)}} you can use to 
replace the two lines of code that compute the snapshot path.


> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is passed, 
> we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

2016-04-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250521#comment-15250521
 ] 

Yongjun Zhang commented on HDFS-10313:
--

BTW, there is a method 
{{getSourceSnapshotPath(Path sourceDir, String snapshotName)}} you can use to 
replace the two lines of code that compute the snapshot path.


> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is passed, 
> we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250520#comment-15250520
 ] 

Xiaobing Zhou commented on HDFS-10207:
--

Thanks [~xyao] for the reviews. v005 addressed all your comments except:

bq. 2. Do we want to keep the unmodifieableList to avoid creating new list upon 
each NamenodeRpcServer#listReconfigurableProperties() call?

A mutable, TreeSet-style RECONFIGURABLE_PROPERTIES is needed because:
1. IPC_CLIENT_RPC_BACKOFF_ENABLE is initialized during NN instantiation, and 
then added to the collection.
2. It avoids duplicated entries as a result of multiple NN instantiations.
3. Some tests depend on the order of reconfigurable properties in the 
collection.
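
A rough sketch of that collection choice (property names illustrative, not the 
actual patch):

{code}
import java.util.Arrays;
import java.util.TreeSet;

class ReconfigurablePropsSketch {
  // A TreeSet gives a stable, sorted order (point 3) and silently ignores
  // duplicate inserts across repeated NN instantiations (point 2).
  static final TreeSet<String> RECONFIGURABLE_PROPERTIES = new TreeSet<>(
      Arrays.asList("dfs.heartbeat.interval",
          "dfs.namenode.heartbeat.recheck-interval"));

  // The port-specific backoff key is only known at NN instantiation (point 1).
  static void registerBackoffKey(int port) {
    RECONFIGURABLE_PROPERTIES.add("ipc." + port + ".backoff.enable");
  }
}
{code}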



> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing

2016-04-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250510#comment-15250510
 ] 

Rakesh R commented on HDFS-10308:
-

Thanks [~cmccabe], for reviewing and committing the patch.

> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10207) Support enable Hadoop IPC backoff without namenode restart

2016-04-20 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10207:
-
Attachment: HDFS-10207-HDFS-9000.005.patch

> Support enable Hadoop IPC backoff without namenode restart
> --
>
> Key: HDFS-10207
> URL: https://issues.apache.org/jira/browse/HDFS-10207
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10207-HDFS-9000.000.patch, 
> HDFS-10207-HDFS-9000.001.patch, HDFS-10207-HDFS-9000.002.patch, 
> HDFS-10207-HDFS-9000.003.patch, HDFS-10207-HDFS-9000.004.patch, 
> HDFS-10207-HDFS-9000.005.patch
>
>
> It will be useful to allow changing {{ipc.#port#.backoff.enable}} without a 
> namenode restart, to protect the namenode from being overloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10308) TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing

2016-04-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-10308:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

+1.

Committed to 2.8, thanks!

> TestRetryCacheWithHA#testRetryCacheOnStandbyNN failing
> --
>
> Key: HDFS-10308
> URL: https://issues.apache.org/jira/browse/HDFS-10308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10308-001.patch, HDFS-10308-002.patch
>
>
> It's failing with the following exception:
> {code}
> java.lang.AssertionError: expected:<25> but was:<26>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testRetryCacheOnStandbyNN(TestRetryCacheWithHA.java:169)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-20 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9869:
---
Attachment: HDFS-9869-006.patch

Attached another patch fixing the {{TestHdfsConfigFields}} test case failure.

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch, HDFS-9869-005.patch, 
> HDFS-9869-006.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as follows:
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250468#comment-15250468
 ] 

Colin Patrick McCabe commented on HDFS-10265:
-

Thanks, [~rakesh_r] and [~brahmareddy].  I will take a look at HDFS-10308.

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Fix For: 2.8.0
>
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has 
> no BLOCK tag, the OEV tool doesn't handle the case and exits with 
> InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
> {code}
> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>
> {code}
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. Client asked the NN to add a block to the file.
> 2. Client failed to write to the DN and asked the NameNode to abandon the block.
> 3. NN removed the block and wrote an OP_UPDATE_BLOCKS to the editlog.
> Finally the NN generated an OP_UPDATE_BLOCKS entry with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.
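
A hedged sketch of one way to handle it, assuming {{Stanza#hasChildren}} is 
available (not necessarily the committed patch):

{code}
// Inside UpdateBlocksOp#fromXml, where st is the parsed <DATA> stanza:
// tolerate a missing BLOCK tag instead of letting Stanza#getChildren throw
// InvalidXmlException.
List<Stanza> blocks = st.hasChildren("BLOCK")
    ? st.getChildren("BLOCK")
    : Collections.<Stanza>emptyList();
{code}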



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10276) Different results for exist call for file.ext/name

2016-04-20 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250459#comment-15250459
 ] 

Colin Patrick McCabe commented on HDFS-10276:
-

bq. I think it's more reasonable to throw ParentNotDirectoryException rather 
than AccessControlException.

I agree.  This seems like it would be an incompatible change, though, so 
probably branch-3 only.

> Different results for exist call for file.ext/name
> --
>
> Key: HDFS-10276
> URL: https://issues.apache.org/jira/browse/HDFS-10276
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kevin Cox
>Assignee: Yuanbo Liu
>
> Given you have a file {{/file}} an existence check for the path 
> {{/file/whatever}} will give different responses for different 
> implementations of FileSystem.
> LocalFileSystem will return false while DistributedFileSystem will throw 
> {{org.apache.hadoop.security.AccessControlException: Permission denied: ..., 
> access=EXECUTE, ...}}
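
A hedged illustration of the reported inconsistency (paths assumed):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExistsUnderFileDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    fs.create(new Path("/file")).close();

    // LocalFileSystem: prints false.
    // DistributedFileSystem: throws AccessControlException (access=EXECUTE),
    // because /file is treated as a path component that must be traversable.
    System.out.println(fs.exists(new Path("/file/whatever")));
  }
}
{code}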



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250443#comment-15250443
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8057:
---

Thanks for checking the results.  Will wait for your next patch.

> Move BlockReader implementation to the client implementation package
> 
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch
>
>
> BlockReaderLocal, RemoteBlockReader, etc should be moved to 
> org.apache.hadoop.hdfs.client.impl.  We may as well rename RemoteBlockReader 
> to BlockReaderRemote.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250439#comment-15250439
 ] 

Tsz Wo Nicholas Sze commented on HDFS-10309:


Hi Amit, thanks for working on this.  Could you also change getBlocksSize, 
getBlocksMinBlockSize and maxSizeToMove to use getLongBytes(..)?

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
> Fix For: 2.8.0
>
> Attachments: HDFS-10309.01.patch
>
>
> While running HDFS Balancer I get the error given below when {{dfs.blocksize}} 
> is defined with a suffix {{k(kilo), m(mega), g(giga)}} in {{hdfs-site.xml}}. In 
> my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a numeric 
> value for {{dfs.blocksize}}, or to change it in your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10313) Distcp does not check the order of snapshot names passed to -diff

2016-04-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250438#comment-15250438
 ] 

Yongjun Zhang commented on HDFS-10313:
--

Hi [~linyiqun],

Thanks a lot for working on this issue! 

I looked into your patch; it looks pretty good. I have a few comments, largely 
cosmetic:

1. It may be better to say "Snapshot <s2> should be newer than <s1>" in the 
following exception, replacing <s2> and <s1> with the real names.

{code}
throw new HadoopIllegalArgumentException(
    "The toSnapshot file should be newer than fromSnapshot file");
{code}

2.
{code}
} catch (FileNotFoundException nfe) {
  DistCp.LOG.warn("The snapshot file not be found.", nfe);
}
{code}
We should return false here. Or maybe we simply throw 
{{InvalidInputException}}, with nfe as the cause. I actually think the latter 
is better.

3. In {{createAndSubmitJob()}} method,
{code}
   if (distCpSync.sync()) {
  createInputFileListingWithDiff(job, distCpSync);
} else {
  inputOptions.disableUsingDiff();
}
{code}
I'd suggest that in the else block we don't disable using diff, but simply 
issue an error with a clear message and throw {{InvalidInputException}}, to be 
caught at the {{DistCp#run}} method, thus quitting DistCp (a rough sketch 
follows at the end of this comment).

Hi [~jingzhao], the old behavior is that we fall back to a regular distcp if 
sync failed; my suggested change means no fallback would happen. I believe it's 
safer, do you agree?

4.
{code}
public void testSyncOfTimeChecking() throws Exception {
{code}
I suggest changing the test name to {{testSyncSnapshotTimeStampChecking()}}.

Thanks.
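
A rough sketch of the suggestion in comment 3 (exception usage and message text 
assumed; not the committed patch):

{code}
// Abort instead of silently falling back to a regular distcp when the
// snapshot diff sync fails; the InvalidInputException is then caught at
// DistCp#run, quitting DistCp.
if (distCpSync.sync()) {
  createInputFileListingWithDiff(job, distCpSync);
} else {
  throw new CopyListing.InvalidInputException(
      "Snapshot diff sync failed; not falling back to a full copy");
}
{code}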


> Distcp does not check the order of snapshot names passed to -diff
> -
>
> Key: HDFS-10313
> URL: https://issues.apache.org/jira/browse/HDFS-10313
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Yongjun Zhang
>Assignee: Lin Yiqun
> Attachments: HDFS-10313.001.patch
>
>
> This jira proposes adding a check to distcp: when {{-diff s1 s2}} is passed, 
> we need to ensure that s2 is newer than s1; otherwise, abort with an 
> informative error message.
> This is the result of my offline discussion with [~jingzhao] on HDFS-9820. 
> Thanks Jing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250323#comment-15250323
 ] 

Hadoop QA commented on HDFS-10318:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 43s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 54s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 133m 28s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799768/HDFS-10318.01.patch |
| JIRA Issue | HDFS-10318 |
| Optional Tests |  

[jira] [Commented] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250219#comment-15250219
 ] 

Hadoop QA commented on HDFS-9869:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 20 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 44s {color} 
| {color:red} root-jdk1.8.0_77 with JDK v1.8.0_77 generated 2 new + 737 
unchanged - 2 fixed = 739 total (was 739) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 26s 
{color} | {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 2 new + 734 
unchanged - 2 fixed = 736 total (was 736) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 42s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 13s 
{color} | {color:red} root: patch generated 3 new + 747 unchanged - 3 fixed = 
750 total (was 750) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 3s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 58s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 7s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit 

[jira] [Commented] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250193#comment-15250193
 ] 

Hadoop QA commented on HDFS-9890:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
12s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 43s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 54s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 28m 54s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_77 with JDK 
v1.8.0_77 generated 41 new + 29 unchanged - 0 fixed = 70 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 34m 46s {color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_95 with JDK 
v1.7.0_95 generated 41 new + 29 unchanged - 0 fixed = 70 total (was 29) {color} 
|
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 26s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 5s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 28s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799772/HDFS-9890.HDFS-8707.002.patch
 |
| JIRA Issue | HDFS-9890 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux e7dd95e3e176 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d8653c8 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_77 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| cc | hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_77: 

[jira] [Commented] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250176#comment-15250176
 ] 

Hadoop QA commented on HDFS-10309:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 22s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 105m 23s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 26s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 229m 10s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.TestDataTransferKeepalive |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | 

[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-04-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250095#comment-15250095
 ] 

Rakesh R commented on HDFS-9833:


Thank you [~drankye], I will soon come up with a proposal.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and the main code needed for the block reconstruction could 
> be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote writing isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-04-20 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9890:
--
Attachment: HDFS-9890.HDFS-8707.002.patch

Uploaded a patch with some of the read pipeline error injection I had locally.

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which 
> do a great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.  We should add a new minidfscluster test that focuses on 
> heavy read/seek load and randomly converts error codes returned by network 
> functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect
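
As a rough sketch of the "randomly convert return codes into errors" idea 
(plain Java rather than libhdfs++ C++, and with hypothetical names):
{code}
import java.io.IOException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical wrapper around a real connection: a fraction of otherwise
// successful reads is converted into an injected failure while the test
// drives heavy concurrent read/seek load.
final class FaultInjectingConnection {
  private final double failureProbability;

  FaultInjectingConnection(double failureProbability) {
    this.failureProbability = failureProbability;
  }

  int read(byte[] buf) throws IOException {
    if (ThreadLocalRandom.current().nextDouble() < failureProbability) {
      throw new IOException("injected disconnect"); // simulate an RPC/DN drop
    }
    return doRealRead(buf); // delegate to the real transport
  }

  private int doRealRead(byte[] buf) {
    return buf.length; // stand-in for the actual network read
  }
}
{code}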



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Attachment: HDFS-10318.01.patch

[~kihwal] Could you please review my patch? Since we have been working with 
this test case, we would like to clean it up a bit; after that we will have 
some minor changes. For now I did not want to mix the clean-up changes with 
this error-message issue.

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10318.01.patch, TestJMXGetFails.log
>
>
> When a metric has an incorrect value, DFSTestUtil.waitForMetric waits 60 sec 
> and then throws a TimeoutException with a thread diagnostic as the error 
> message, so our asserts are never reached. Please check [^TestJMXGetFails.log] 
> for the error message.
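
One possible shape for the fix, sketched under the assumption that the wait 
loop is built on {{GenericTestUtils.waitFor}}; {{readMetric}} is an 
illustrative stand-in for the JMX lookup:
{code}
import java.util.concurrent.TimeoutException;

import org.apache.hadoop.test.GenericTestUtils;

final class MetricWaitSketch {
  static void waitForMetricValue(final String name, final long expected)
      throws Exception {
    try {
      GenericTestUtils.waitFor(() -> readMetric(name) == expected, 100, 60_000);
    } catch (TimeoutException te) {
      // Surface the real failure instead of only a thread diagnostic.
      throw new AssertionError("Metric " + name + " expected " + expected
          + " but was " + readMetric(name), te);
    }
  }

  private static long readMetric(String name) {
    return 0; // stand-in for the actual JMX metric lookup
  }
}
{code}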



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Status: Patch Available  (was: Open)

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-10318.01.patch, TestJMXGetFails.log
>
>
> When a metric has an incorrect value, DFSTestUtil.waitForMetric waits 60 sec 
> and then throws a TimeoutException with a thread diagnostic as the error 
> message, so our asserts are never reached. Please check [^TestJMXGetFails.log] 
> for the error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-04-20 Thread Xiaowei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaowei Zhu reassigned HDFS-9890:
-

Assignee: Xiaowei Zhu  (was: Stephen)

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which 
> do a great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly converts error codes returned by 
> network functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> - Rpc connection disconnect
> - Rpc connection slowed down enough to cause a timeout and trigger retry
> - DN connection disconnect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Hadoop Flags:   (was: Reviewed)

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: TestJMXGetFails.log
>
>
> When a metric has an incorrect value, DFSTestUtil.waitForMetric waits 60 sec 
> and then throws a TimeoutException with a thread diagnostic as the error 
> message, so our asserts are never reached. Please check [^TestJMXGetFails.log] 
> for the error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-04-20 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249965#comment-15249965
 ] 

Kai Zheng commented on HDFS-9833:
-

Per an off-line discussion with [~rakeshr], he'd like to help with this, so I 
reassigned it. Thanks, Rakesh, for taking this!

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and the main code needed for the block reconstruction could 
> be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-04-20 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9833:

Assignee: Rakesh R  (was: Kai Zheng)

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and the main code needed for the block reconstruction could 
> be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249906#comment-15249906
 ] 

Chris Nauroth commented on HDFS-10312:
--

bq. With your patch, to get out of the current situation, 
ipc.maximum.data.length should be changed on both the NN and DN side.

The slightly strange thing is that the 64 MB enforcement by protobuf seems to 
happen only at the time of decoding a message, not at the time of creating 
it.  In my testing, I only saw problems on the server side consuming the 
message (the NameNode).  I'm not sure that it would be strictly required to 
make the configuration change on DataNodes, but there is also no harm in doing 
it that way.
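
A small sketch of where the limit bites, assuming the raw message bytes are 
in hand: protobuf's {{CodedInputStream}} applies the cap while parsing, which 
is why only the receiving side needs the override.
{code}
import com.google.protobuf.CodedInputStream;

final class DecodeLimitSketch {
  // Raise protobuf's per-message size cap before decoding a large report.
  static CodedInputStream newLimitedStream(byte[] rawBytes, int maxDataLength) {
    CodedInputStream cis = CodedInputStream.newInstance(rawBytes);
    cis.setSizeLimit(maxDataLength); // e.g. the ipc.maximum.data.length value
    return cis;
  }
}
{code}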

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249897#comment-15249897
 ] 

Rakesh R commented on HDFS-9869:


Thanks [~zhz], [~andrew.wang] for the advice. Attached a new patch with the 
suggested changes.

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch, HDFS-9869-005.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager:
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-20 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9869:
---
Attachment: HDFS-9869-005.patch

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch, HDFS-9869-005.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager:
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Amit Anand (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249743#comment-15249743
 ] 

Amit Anand commented on HDFS-10309:
---

After building the patched jar I was able to test it successfully:
{code}
hdfs@bcpc-vm3:~$ hdfs balancer
16/04/20 08:05:54 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
16/04/20 08:05:54 INFO balancer.Balancer: parameters = 
Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, 
#blockpools = 0, run during upgrade = false]
16/04/20 08:05:54 INFO balancer.Balancer: included nodes = []
16/04/20 08:05:54 INFO balancer.Balancer: excluded nodes = []
16/04/20 08:05:54 INFO balancer.Balancer: source nodes = []
Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
Bytes Being Moved
16/04/20 08:05:55 INFO balancer.KeyManager: Block token params received from 
NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
16/04/20 08:05:55 INFO block.BlockTokenSecretManager: Setting block keys
16/04/20 08:05:55 INFO balancer.KeyManager: Update block keys every 2hrs, 
30mins, 0sec
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 540 
(default=540)
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
(default=1000)
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 
(default=200)
16/04/20 08:05:55 INFO balancer.Balancer: 
dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
2147483648 (default=2147483648)
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size 
= 10485760 (default=10485760)
16/04/20 08:05:55 INFO block.BlockTokenSecretManager: Setting block keys
16/04/20 08:05:55 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
10737418240 (default=10737418240)
16/04/20 08:05:55 INFO balancer.Balancer: dfs.blocksize = 134217728 
(default=134217728)
16/04/20 08:05:55 INFO net.NetworkTopology: Adding a new node: 
/default-rack/192.168.100.14:1004
16/04/20 08:05:55 INFO net.NetworkTopology: Adding a new node: 
/default-rack/192.168.100.15:1004
16/04/20 08:05:55 INFO net.NetworkTopology: Adding a new node: 
/default-rack/192.168.100.13:1004
16/04/20 08:05:55 INFO balancer.Balancer: 0 over-utilized: []
16/04/20 08:05:55 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
Apr 20, 2016 8:05:55 AM   0  0 B 0 B
   -1 B
Apr 20, 2016 8:05:56 AM  Balancing took 2.179 seconds
{code}
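
For context, a minimal sketch of the likely shape of the fix: 
{{Configuration.getLong}} rejects suffixed values like {{128m}}, while 
{{Configuration.getLongBytes}} understands the k/m/g suffixes (the actual 
patch contents may differ):
{code}
import org.apache.hadoop.conf.Configuration;

final class BlocksizeParseSketch {
  static long blockSizeBytes(Configuration conf) {
    // Accepts "134217728" as well as "128m", "1g", etc.
    return conf.getLongBytes("dfs.blocksize", 128L * 1024 * 1024);
  }
}
{code}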

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
> Fix For: 2.8.0
>
> Attachments: HDFS-10309.01.patch
>
>
> While running the HDFS Balancer I get the error given below when 
> {{dfs.blockSize}} is defined with a suffix {{k(kilo), m(mega), g(giga)}} in 
> {{hdfs-site.xml}}. In my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 

[jira] [Updated] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Amit Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Anand updated HDFS-10309:
--
Fix Version/s: 2.8.0
   Status: Patch Available  (was: In Progress)

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
> Fix For: 2.8.0
>
> Attachments: HDFS-10309.01.patch
>
>
> While running the HDFS Balancer I get the error given below when 
> {{dfs.blockSize}} is defined with a suffix {{k(kilo), m(mega), g(giga)}} in 
> {{hdfs-site.xml}}. In my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a 
> numeric value for {{dfs.blocksize}}, or to change your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Amit Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Anand updated HDFS-10309:
--
Attachment: HDFS-10309.01.patch

> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
> Attachments: HDFS-10309.01.patch
>
>
> While running the HDFS Balancer I get the error given below when 
> {{dfs.blockSize}} is defined with a suffix {{k(kilo), m(mega), g(giga)}} in 
> {{hdfs-site.xml}}. In my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a 
> numeric value for {{dfs.blocksize}}, or to change your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-10309) HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), m(mega), g(giga)

2016-04-20 Thread Amit Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-10309 started by Amit Anand.
-
> HDFS Balancer doesn't honor dfs.blocksize value defined with suffix k(kilo), 
> m(mega), g(giga)
> -
>
> Key: HDFS-10309
> URL: https://issues.apache.org/jira/browse/HDFS-10309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.8.0
>Reporter: Amit Anand
>Assignee: Amit Anand
>
> While running the HDFS Balancer I get the error given below when 
> {{dfs.blockSize}} is defined with a suffix {{k(kilo), m(mega), g(giga)}} in 
> {{hdfs-site.xml}}. In my deployment {{dfs.blocksize}} is set to {{128m}}. 
> {code}
> hdfs@bcpc-vm1:/home/ubuntu$ hdfs balancer
> 16/04/19 08:49:51 INFO balancer.Balancer: namenodes  = [hdfs://Test-Laptop]
> 16/04/19 08:49:51 INFO balancer.Balancer: parameters = 
> Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle 
> iteration = 5, #excluded nodes = 0, #included nodes = 0, #source 
> nodes = 0, #blockpools = 0, run during upgrade = false]
> 16/04/19 08:49:51 INFO balancer.Balancer: included nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: excluded nodes = []
> 16/04/19 08:49:51 INFO balancer.Balancer: source nodes = []
> Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 16/04/19 08:49:52 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 
> 540 (default=540)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 
> (default=1000)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 
> 200 (default=200)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 
> 2147483648 (default=2147483648)
> 16/04/19 08:49:52 INFO balancer.Balancer: 
> dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
> 16/04/19 08:49:52 INFO block.BlockTokenSecretManager: Setting block keys
> 16/04/19 08:49:52 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 
> 10737418240 (default=10737418240)
> Apr 19, 2016 8:49:52 AM  Balancing took 1.408 seconds
> 16/04/19 08:49:52 ERROR balancer.Balancer: Exiting balancer due an exception
> java.lang.NumberFormatException: For input string: "128m"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.parseLong(Long.java:631)
> at 
> org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.getLong(Balancer.java:221)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.(Balancer.java:281)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:660)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:774)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:903)
> {code}
> However, the workaround for this is to run {{hdfs balancer}} passing a 
> numeric value for {{dfs.blocksize}}, or to change your {{hdfs-site.xml}}.
> {code}
> hdfs balancer -Ddfs.blocksize=134217728
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249701#comment-15249701
 ] 

Hadoop QA commented on HDFS-5280:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 22s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
42 unchanged - 1 fixed = 43 total (was 43) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 135m 28s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 125m 57s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
41s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 293m 20s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.TestEditLog |

[jira] [Commented] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249649#comment-15249649
 ] 

Walter Su commented on HDFS-5280:
-

+1 for catching the exception. The same exception will cause {{BlockScanner}} 
to shut down.
We should be cautious about catching any {{RuntimeException}}. Instead of 
adding a {{catch}} to the outer try-finally clause, how about catching the 
exact exception at the place where it is thrown, like what we did in 
{{FSNamesystem.java}}:
{code}
try {
  checksumType = DataChecksum.Type.valueOf(checksumTypeStr);
} catch (IllegalArgumentException iae) {
  throw new IOException("Invalid checksum type in "
      + DFS_CHECKSUM_TYPE_KEY + ": " + checksumTypeStr);
}
{code}
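
Applied to this issue, the same pattern might look roughly like the following 
(illustrative names, not the actual patch): catch the specific exception where 
the corrupt meta header is parsed and convert it into an {{IOException}}, so 
the replica is treated as corrupt instead of killing the scanner thread.
{code}
import java.io.DataInputStream;
import java.io.File;
import java.io.IOException;

import org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader;
import org.apache.hadoop.util.DataChecksum;

final class MetaHeaderSketch {
  static DataChecksum readChecksumHeader(DataInputStream metaIn, File metaFile)
      throws IOException {
    try {
      return BlockMetadataHeader.readDataChecksum(metaIn, metaFile);
    } catch (IllegalArgumentException iae) {
      // Corrupt header: report it as an I/O problem on this replica.
      throw new IOException("Corrupt checksum header in " + metaFile, iae);
    }
  }
}
{code}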

> Corrupted meta files on data nodes prevents DFClient from connecting to data 
> nodes and updating corruption status to name node.
> ---
>
> Key: HDFS-5280
> URL: https://issues.apache.org/jira/browse/HDFS-5280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs-client
>Affects Versions: 1.1.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha, 2.7.2
> Environment: Red hat enterprise 6.4
> Hadoop-2.1.0
>Reporter: Jinghui Wang
>Assignee: Andres Perez
> Attachments: HDFS-5280.patch
>
>
> Corrupted meta files cause the DFSClient to be unable to connect to the 
> datanodes to access the blocks, so the DFSClient never performs a read on the 
> block, which is what would throw the ChecksumException when file blocks are 
> corrupted and report to the namenode to mark the block as corrupt.  Since the 
> client never gets that far, the file status remains healthy, and so do all 
> the blocks.
> To replicate the error, put a file onto HDFS.
> Running hadoop fsck /tmp/bogus.csv -files -blocks -location will give the 
> following output:
> FSCK started for path /tmp/bogus.csv at 11:33:29
> /tmp/bogus.csv 109 bytes, 1 block(s):  OK
> 0. blk_-4255166695856420554_5292 len=109 repl=3
> Find the block/meta files for 4255166695856420554 by running 
> ssh datanode1.address find /hadoop/ -name "*4255166695856420554*", which will 
> give the following output:
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta
> Now corrupt the meta file by running 
> ssh datanode1.address "sed -i -e '1i 1234567891' 
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta"
> Now run hadoop fs -cat /tmp/bogus.csv, which 
> will show the stack trace of the DFSClient failing to connect to the data node 
> with the corrupted meta file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Description: When a metric has an incorrect value, 
DFSTestUtil.waitForMetric waits 60 sec and then throws a TimeoutException with 
a thread diagnostic as the error message, so our asserts are never reached. 
Please check [^TestJMXGetFails.log] for the error message.  (was: It fails 
with java.util.concurrent.TimeoutException. Actually the problem here is that 
we expect 2 as the NumOpenConnections metric but it is only 1. So the test 
waits 60 sec and then fails.

Please find the maven output with the stack trace attached 
([^TestJMXGetFails.log]).)

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: TestJMXGetFails.log
>
>
> When a metric has an incorrect value, DFSTestUtil.waitForMetric waits 60 sec 
> and then throws a TimeoutException with a thread diagnostic as the error 
> message, so our asserts are never reached. Please check [^TestJMXGetFails.log] 
> for the error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Attachment: TestJMXGetFails.log

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: TestJMXGetFails.log
>
>
> It fails with java.util.concurrent.TimeoutException. Actually the problem 
> here is that we expect 2 as the NumOpenConnections metric but it is only 1. 
> So the test waits 60 sec and then fails.
> Please find the maven output with the stack trace attached 
> ([^TestJMXGetFails.log]).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10318) TestJMXGet hides the real error in case of test failure

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10318:

Summary: TestJMXGet hides the real error in case of test failure  (was: 
TestJMXGet:testNameNode() fails)

> TestJMXGet hides the real error in case of test failure
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
>
> It fails with java.util.concurrent.TimeoutException. Actually the problem 
> here is that we expect 2 as the NumOpenConnections metric but it is only 1. 
> So the test waits 60 sec and then fails.
> Please find the maven output with the stack trace attached 
> ([^TestJMXGetFails.log]).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10318) TestJMXGet:testNameNode() fails

2016-04-20 Thread Andras Bokor (JIRA)
Andras Bokor created HDFS-10318:
---

 Summary: TestJMXGet:testNameNode() fails
 Key: HDFS-10318
 URL: https://issues.apache.org/jira/browse/HDFS-10318
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0, 2.8.0
Reporter: Andras Bokor
Assignee: Gergely Novák
Priority: Minor
 Fix For: 2.8.0


It fails with java.util.concurrent.TimeoutException. Actually the problem here 
is that we expect 2 as the NumOpenConnections metric but it is only 1. So the 
test waits 60 sec and then fails.

Please find the maven output with the stack trace attached 
([^TestJMXGetFails.log]).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-10318) TestJMXGet:testNameNode() fails

2016-04-20 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor reassigned HDFS-10318:
---

Assignee: Andras Bokor  (was: Gergely Novák)

> TestJMXGet:testNameNode() fails
> ---
>
> Key: HDFS-10318
> URL: https://issues.apache.org/jira/browse/HDFS-10318
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Fix For: 2.8.0
>
>
> It fails with java.util.concurrent.TimeoutException. Actually the problem 
> here is that we expect 2 as the NumOpenConnections metric but it is only 1. 
> So the test waits 60 sec and then fails.
> Please find the maven output with the stack trace attached 
> ([^TestJMXGetFails.log]).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249607#comment-15249607
 ] 

Hadoop QA commented on HDFS-8449:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 
61 unchanged - 0 fixed = 63 total (was 61) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 36s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 48s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 135m 21s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.balancer.TestBalancer |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-10276) Different results for exist call for file.ext/name

2016-04-20 Thread Yuanbo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249555#comment-15249555
 ] 

Yuanbo Liu commented on HDFS-10276:
---

[~cmccabe] Thanks for your comment. Yes, I agree with you that we should take 
unprivileged users into consideration. But in this case, the NameNode should 
prevent any further operation since it knows the "parent path" is a file, not 
a real directory. I think it's more reasonable to throw 
{{ParentNotDirectoryException}} rather than {{AccessControlException}}. See 
the behavior of Linux below:
{code}
[yuanbo@oc7702007844 ~]$ ls test/test1
ls: cannot access test/test1: Not a directory
{code}
One thing that should be rectified is that {{getFileInfo}} in {{DFSClient}} 
causes this problem, not the {{exists}} method.
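
For reference, a tiny sketch of the call under discussion; the path is the 
one from the issue description, and the surrounding setup is hypothetical:
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class ExistsUnderFileSketch {
  static boolean checkUnderFile(FileSystem fs) throws IOException {
    // "/file" is a regular file, so "/file/whatever" cannot exist.
    // LocalFileSystem returns false here; DistributedFileSystem currently
    // surfaces an AccessControlException from the getFileInfo RPC instead.
    return fs.exists(new Path("/file/whatever"));
  }
}
{code}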

> Different results for exist call for file.ext/name
> --
>
> Key: HDFS-10276
> URL: https://issues.apache.org/jira/browse/HDFS-10276
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kevin Cox
>Assignee: Yuanbo Liu
>
> Given you have a file {{/file}}, an existence check for the path 
> {{/file/whatever}} will give different responses for different 
> implementations of FileSystem.
> LocalFileSystem will return false while DistributedFileSystem will throw 
> {{org.apache.hadoop.security.AccessControlException: Permission denied: ..., 
> access=EXECUTE, ...}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10317) dfs.domain.socket.path is not set in TestShortCircuitLocalRead.testReadWithRemoteBlockReader

2016-04-20 Thread Li Bo (JIRA)
Li Bo created HDFS-10317:


 Summary: dfs.domain.socket.path is not set in 
TestShortCircuitLocalRead.testReadWithRemoteBlockReader
 Key: HDFS-10317
 URL: https://issues.apache.org/jira/browse/HDFS-10317
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Li Bo


org.apache.hadoop.HadoopIllegalArgumentException: The short-circuit local reads 
feature is enabled but dfs.domain.socket.path is not set.
at 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.(DomainSocketFactory.java:115)
at org.apache.hadoop.hdfs.ClientContext.(ClientContext.java:132)
at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:157)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:358)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:275)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:266)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:258)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2466)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2512)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1632)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:844)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitReadWithRemoteBlockReader(TestShortCircuitLocalRead.java:608)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.testReadWithRemoteBlockReader(TestShortCircuitLocalRead.java:590)
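
A plausible remedy, sketched after the pattern other short-circuit tests use 
({{TemporarySocketDirectory}} plus the {{_PORT}} placeholder); the actual fix 
for this JIRA may differ:
{code}
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.unix.TemporarySocketDirectory;

final class DomainSocketConfSketch {
  static Configuration withDomainSocketPath() {
    TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
    Configuration conf = new Configuration();
    // Point dfs.domain.socket.path at a temp dir before starting the cluster.
    conf.set("dfs.domain.socket.path",
        new File(sockDir.getDir(), "scr._PORT").getAbsolutePath());
    return conf;
  }
}
{code}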



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-20 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249525#comment-15249525
 ] 

Takanobu Asanuma commented on HDFS-8057:


Some comments on the test results:
* I think the checkstyle warnings are the same as in the current code. After 
resolving this jira, it would be good to clean up the checkstyle issues around 
BlockReader.
* The new findbugs warning comes from changing the variable to public. This 
variable is used in {{TestBlockReaderFactory}}. I will use reflection there.
* The failed unit tests in {{TestRetryCacheWithHA}} and 
{{TestNamenodeRetryCache}} may be affected by the first patch. I am 
investigating the causes.

> Move BlockReader implementation to the client implementation package
> 
>
> Key: HDFS-8057
> URL: https://issues.apache.org/jira/browse/HDFS-8057
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Takanobu Asanuma
> Attachments: HDFS-8057.1.patch
>
>
> BlockReaderLocal, RemoteBlockReader, etc should be moved to 
> org.apache.hadoop.hdfs.client.impl.  We may as well rename RemoteBlockReader 
> to BlockReaderRemote.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10312) Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.

2016-04-20 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249482#comment-15249482
 ] 

Brahma Reddy Battula commented on HDFS-10312:
-

You are right. It would be non-trivial for admins to split the existing 
storage directory into multiple storage directories.

With your patch, to get out of the current situation, ipc.maximum.data.length 
should be changed on both the NN and DN side. 
I am also fine with this approach.

> Large block reports may fail to decode at NameNode due to 64 MB protobuf 
> maximum length restriction.
> 
>
> Key: HDFS-10312
> URL: https://issues.apache.org/jira/browse/HDFS-10312
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-10312.001.patch, HDFS-10312.002.patch, 
> HDFS-10312.003.patch, HDFS-10312.004.patch
>
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this can be uptuned using 
> {{ipc.maximum.data.length}}.  However, for block reports, there is still an 
> internal maximum length restriction of 64 MB enforced by protobuf.  (Sample 
> stack trace to follow in comments.)  This issue proposes to apply the same 
> override to our block list decoding, so that large block reports can proceed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-04-20 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Attachment: HDFS-8449-005.patch

Thanks for Kai's review.
There are several other EC-related metrics to add, so I think we can put the 
unit tests in a single file at first and consider moving them to another file 
at the end.

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch, 
> HDFS-8449-005.patch
>
>
> This subtask tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8057) Move BlockReader implementation to the client implementation package

2016-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249390#comment-15249390
 ] 

Hadoop QA commented on HDFS-8057:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s 
{color} | {color:red} hadoop-hdfs-project: patch generated 72 new + 212 
unchanged - 71 fixed = 284 total (was 283) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 
0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 20s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 37s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense 

[jira] [Updated] (HDFS-5280) Corrupted meta files on data nodes prevent DFSClient from connecting to data nodes and updating corruption status to name node.

2016-04-20 Thread Andres Perez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-5280:
---
Attachment: HDFS-5280.patch

> Corrupted meta files on data nodes prevent DFSClient from connecting to data 
> nodes and updating corruption status to name node.
> ---
>
> Key: HDFS-5280
> URL: https://issues.apache.org/jira/browse/HDFS-5280
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs-client
>Affects Versions: 1.1.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha, 2.7.2
> Environment: Red hat enterprise 6.4
> Hadoop-2.1.0
>Reporter: Jinghui Wang
>Assignee: Andres Perez
> Attachments: HDFS-5280.patch
>
>
> Corrupted meta files prevent the DFSClient from connecting to the datanodes 
> to access the blocks, so the DFSClient never performs a read on the block. 
> Reading is what throws the ChecksumException when file blocks are corrupted 
> and reports to the namenode so it can mark the block as corrupt. Since the 
> client never gets that far, the file status remains healthy, and so do all 
> the blocks.
> To replicate the error, put a file onto HDFS, then run 
> hadoop fsck /tmp/bogus.csv -files -blocks -locations, which gives the 
> following output:
> FSCK started for path /tmp/bogus.csv at 11:33:29
> /tmp/bogus.csv 109 bytes, 1 block(s):  OK
> 0. blk_-4255166695856420554_5292 len=109 repl=3
> Find the block/meta files for 4255166695856420554 by running 
> ssh datanode1.address find /hadoop/ -name "*4255166695856420554*", which 
> gives the following output:
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta
> Now corrupt the meta file by running 
> ssh datanode1.address "sed -i -e '1i 1234567891' 
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta" 
> Now running hadoop fs -cat /tmp/bogus.csv shows the stack trace of the 
> DFSClient failing to connect to the datanode with the corrupted meta file; 
> a sketch of the read path in question follows below.
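
For illustration, a minimal client-side sketch, assuming a reachable cluster 
and the stock FileSystem API (this is not part of the attached patch): 
ChecksumException can only surface once block data is actually read, which 
never happens when the connection to the datanode fails first.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Sketch of the hadoop fs -cat read path; file path taken from the repro.
public class CatWithChecksum {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    try (FSDataInputStream in = fs.open(new Path("/tmp/bogus.csv"))) {
      // Reading corrupt block data raises ChecksumException, after which the
      // client reports the bad block to the namenode.
      IOUtils.copyBytes(in, System.out, 4096, false);
    } catch (ChecksumException ce) {
      System.err.println("Block data corrupt: " + ce);
    }
    // With a corrupted meta file, the datanode connection fails before any
    // read starts, so neither branch above ever marks the block as corrupt.
  }
}
{code}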



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

