[jira] [Commented] (HDFS-11580) Ozone: Support asynchronous client API for SCM and containers

2017-04-25 Thread Mukul Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984221#comment-15984221
 ] 

Mukul Kumar Singh commented on HDFS-11580:
--

Thanks [~linyiqun] for the updated patch.

In the latest patch, Trace ID is being used to match a response to its 
corresponding request.
{code}
if (request.getTraceID().equals(curResponse.getTraceID())) {
  response = curResponse;
} else {
  pendingResponses.put(curResponse.getTraceID(), curResponse);
  // Try to get the response from the pending-responses map, removing
  // it from the map if found.
  response = pendingResponses.remove(request.getTraceID());
}
{code}

However, with Cblock there are cases where the trace ID is not set correctly, 
so I feel the trace ID should not be used to match a response to its 
corresponding request. Would a counter be a better parameter to use here? For 
example, {{BlockWriterTask}} currently passes an empty trace ID:
{code}
// BlockWriterTask: note the empty string passed as the trace ID (last arg).
ContainerProtocolCalls.writeSmallFile(client, containerName,
    Long.toString(block.getBlockID()), data, "");
{code}
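One way to realize the counter idea, as a minimal sketch with hypothetical 
names (this is not code from the patch): the client tags every outgoing 
request with a monotonically increasing id and matches responses on that id, 
so correlation no longer depends on callers supplying a trace ID.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: correlate responses by a client-generated call id. */
class CallIdMatcher<R> {
  private final AtomicLong nextCallId = new AtomicLong();
  private final ConcurrentMap<Long, R> pendingResponses =
      new ConcurrentHashMap<>();

  /** Assign a unique id to tag an outgoing request with. */
  long newCallId() {
    return nextCallId.incrementAndGet();
  }

  /** Record an arriving response under the id it was tagged with. */
  void onResponse(long callId, R response) {
    pendingResponses.put(callId, response);
  }

  /** Fetch and remove the response for the given request id, if present. */
  R take(long callId) {
    return pendingResponses.remove(callId);
  }
}
{code}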

> Ozone: Support asynchronous client API for SCM and containers
> 
>
> Key: HDFS-11580
> URL: https://issues.apache.org/jira/browse/HDFS-11580
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Attachments: HDFS-11580-HDFS-7240.001.patch, 
> HDFS-11580-HDFS-7240.002.patch, HDFS-11580-HDFS-7240.003.patch, 
> HDFS-11580-HDFS-7240.004.patch
>
>
> This is an umbrella JIRA for supporting a set of APIs in asynchronous form.
> For containers, the datanode API currently supports a single call, 
> {{sendCommand}}; we need to build a proper programming interface and support 
> an async variant of it.
> There is also a set of SCM APIs that clients can call; it would be nice to 
> support an async interface for those too.
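> A possible shape for the async variant, as a sketch only (the interface and 
> method names are assumptions, not from a patch; Req/Resp stand in for the 
> ContainerCommand protobuf types used by the container protocol):
> {code}
> import java.util.concurrent.CompletableFuture;
>
> /** Sketch: a non-blocking counterpart to sendCommand. */
> interface AsyncContainerClient<Req, Resp> {
>   /** Returns immediately; the future completes when the reply arrives. */
>   CompletableFuture<Resp> sendCommandAsync(Req request);
> }
> {code}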



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-04-25 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984209#comment-15984209
 ] 

Surendra Singh Lilhore commented on HDFS-5042:
--

We also faced the same problem. Could the NameNode recover this kind of block 
after receiving the block report? 
If the reported block's genstamp and size match the NameNode's in-memory 
metadata, the NameNode could send a command to the datanode to recover the 
replica from the wrong state.
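
A rough sketch of that matching check (the types here are hypothetical 
stand-ins for the real block-report data structures, not an implementation):

{code}
/** Sketch only: decide whether a mis-placed replica can be finalized. */
class ReplicaRecoveryCheck {
  static class BlockInfo {
    long generationStamp;
    long numBytes;
  }

  /** True when the reported replica matches the completed block exactly. */
  static boolean matchesCompletedBlock(BlockInfo reported, BlockInfo stored) {
    return reported.generationStamp == stored.generationStamp
        && reported.numBytes == stored.numBytes;
  }
}
{code}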


> Completed files lost after power failure
> 
>
> Key: HDFS-5042
> URL: https://issues.apache.org/jira/browse/HDFS-5042
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: ext3 on CentOS 5.7 (kernel 2.6.18-274.el5)
>Reporter: Dave Latham
>Priority: Critical
>
> We suffered a cluster wide power failure after which HDFS lost data that it 
> had acknowledged as closed and complete.
> The client was HBase which compacted a set of HFiles into a new HFile, then 
> after closing the file successfully, deleted the previous versions of the 
> file.  The cluster then lost power, and when brought back up the newly 
> created file was marked CORRUPT.
> Based on reading the logs it looks like the replicas were created by the 
> DataNodes in the 'blocksBeingWritten' directory.  Then when the file was 
> closed they were moved to the 'current' directory.  After the power cycle 
> those replicas were again in the blocksBeingWritten directory of the 
> underlying file system (ext3).  When those DataNodes reported in to the 
> NameNode it deleted those replicas and lost the file.
> Some possible fixes could be having the DataNode fsync the directory(s) after 
> moving the block from blocksBeingWritten to current to ensure the rename is 
> durable or having the NameNode accept replicas from blocksBeingWritten under 
> certain circumstances.
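> A minimal sketch of the first fix, assuming Linux semantics where calling 
> force() on a read-only channel opened on a directory issues fsync(2) on the 
> directory inode (the class name here is hypothetical):
> {code}
> import java.io.IOException;
> import java.nio.channels.FileChannel;
> import java.nio.file.Path;
> import java.nio.file.StandardOpenOption;
>
> class DurableRename {
>   /** Flush directory entries so a completed rename survives power loss. */
>   static void fsyncDirectory(Path dir) throws IOException {
>     try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
>       ch.force(true);  // persists the rename recorded in this directory
>     }
>   }
> }
> {code}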
> Log snippets from RS (RegionServer), NN (NameNode), DN (DataNode):
> {noformat}
> RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: 
> Creating 
> file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  with permission=rwxrwxrwx
> NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c.
>  blk_1395839728632046111_357084589
> DN 2013-06-29 11:16:06,832 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block 
> blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: 
> /10.0.5.237:50010
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Received block 
> blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block 
> blk_1395839728632046111_357084589 terminating
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing 
> lease on  file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  from client DFSClient_hb_rs_hs745,60020,1372470111932
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  is closed by DFSClient_hb_rs_hs745,60020,1372470111932
> RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Renaming compacted file at 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  to 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
> RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Completed major compaction of 7 file(s) in n of 
> users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into 
> 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; total size for store is 24.2m
> ---  CRASH, RESTART  ---
> NN 2013-06-29 12:01:19,743 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: addStoredBlock request received for 
> 

[jira] [Commented] (HDFS-9962) Erasure Coding: need a way to test multiple EC policies

2017-04-25 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984187#comment-15984187
 ] 

Takanobu Asanuma commented on HDFS-9962:


Hi [~Sammi], yes, I still plan to do it. Sorry for the delay.

> Erasure Coding: need a way to test multiple EC policies
> ---
>
> Key: HDFS-9962
> URL: https://issues.apache.org/jira/browse/HDFS-9962
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Takanobu Asanuma
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Now that we support multiple EC policies, we need a way to test them to 
> catch potential issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11704) OzoneFileSystem: A Hadoop file system implementation for Ozone

2017-04-25 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-11704:


 Summary: OzoneFileSystem: A Hadoop file system implementation for 
Ozone
 Key: HDFS-11704
 URL: https://issues.apache.org/jira/browse/HDFS-11704
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: fs/ozone
Reporter: Mingliang Liu
Assignee: Mingliang Liu






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11703:
--
Status: Patch Available  (was: Open)

> [READ] Tests for ProvidedStorageMap
> ---
>
> Key: HDFS-11703
> URL: https://issues.apache.org/jira/browse/HDFS-11703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
> Attachments: HDFS-11703-HDFS-9806.001.patch, 
> HDFS-11703-HDFS-9806.002.patch
>
>
> Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11703:
--
Attachment: HDFS-11703-HDFS-9806.002.patch

Posting an updated patch with checkstyle errors fixed. 

> [READ] Tests for ProvidedStorageMap
> ---
>
> Key: HDFS-11703
> URL: https://issues.apache.org/jira/browse/HDFS-11703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
> Attachments: HDFS-11703-HDFS-9806.001.patch, 
> HDFS-11703-HDFS-9806.002.patch
>
>
> Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11703:
--
Status: Open  (was: Patch Available)

> [READ] Tests for ProvidedStorageMap
> ---
>
> Key: HDFS-11703
> URL: https://issues.apache.org/jira/browse/HDFS-11703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
> Attachments: HDFS-11703-HDFS-9806.001.patch
>
>
> Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984129#comment-15984129
 ] 

Hadoop QA commented on HDFS-11384:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
53s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 280 unchanged - 2 fixed = 284 total (was 282) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HDFS-11384 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865051/HDFS-11384.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux beb7a0d3cf1c 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 4ea2778 |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19201/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19201/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console 

[jira] [Updated] (HDFS-10675) [READ] Datanode support to read from external stores.

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-10675:
--
Summary: [READ] Datanode support to read from external stores.  (was: 
[HDFS-9806][READ] Datanode support to read from external stores.)

> [READ] Datanode support to read from external stores.
> -
>
> Key: HDFS-10675
> URL: https://issues.apache.org/jira/browse/HDFS-10675
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10675-HDFS-9806.001.patch, 
> HDFS-10675-HDFS-9806.002.patch, HDFS-10675-HDFS-9806.003.patch, 
> HDFS-10675-HDFS-9806.004.patch, HDFS-10675-HDFS-9806.005.patch, 
> HDFS-10675-HDFS-9806.006.patch, HDFS-10675-HDFS-9806.007.patch, 
> HDFS-10675-HDFS-9806.008.patch, HDFS-10675-HDFS-9806.009.patch
>
>
> This JIRA introduces a new {{PROVIDED}} {{StorageType}} to represent external 
> stores, along with enabling the Datanode to read from such stores using a 
> {{ProvidedReplica}} and a {{ProvidedVolume}}.
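> For illustration, a sketch of where the new value would sit relative to 
> Hadoop's existing storage types (simplified; the real enum carries extra 
> attributes):
> {code}
> public enum StorageType {
>   RAM_DISK,  // transient, memory-backed
>   SSD,
>   DISK,
>   ARCHIVE,
>   PROVIDED   // new: replicas served from an external store
> }
> {code}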



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10675) [HDFS-9806][READ] Datanode support to read from external stores.

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-10675:
--
Summary: [HDFS-9806][READ] Datanode support to read from external stores.  
(was: [READ] Datanode support to read from external stores.)

> [HDFS-9806][READ] Datanode support to read from external stores.
> 
>
> Key: HDFS-10675
> URL: https://issues.apache.org/jira/browse/HDFS-10675
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-10675-HDFS-9806.001.patch, 
> HDFS-10675-HDFS-9806.002.patch, HDFS-10675-HDFS-9806.003.patch, 
> HDFS-10675-HDFS-9806.004.patch, HDFS-10675-HDFS-9806.005.patch, 
> HDFS-10675-HDFS-9806.006.patch, HDFS-10675-HDFS-9806.007.patch, 
> HDFS-10675-HDFS-9806.008.patch, HDFS-10675-HDFS-9806.009.patch
>
>
> This JIRA introduces a new {{PROVIDED}} {{StorageType}} to represent external 
> stores, along with enabling the Datanode to read from such stores using a 
> {{ProvidedReplica}} and a {{ProvidedVolume}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies

2017-04-25 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984106#comment-15984106
 ] 

Huafeng Wang commented on HDFS-11605:
-

Hi Kai, there is one checkstyle issue left, the method-length check in 
FSNamesystem, but it was not introduced by my patch. Maybe we can just wait 
for the QA result.

> Allow user to customize and add new erasure code codecs and policies
> 
>
> Key: HDFS-11605
> URL: https://issues.apache.org/jira/browse/HDFS-11605
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch, 
> HDFS-11605.003.patch, HDFS-11605.004.patch
>
>
> Based on the facility developed in HDFS-11604, this will add the necessary 
> CLI command to load an XML file; the resulting policies will be maintained 
> in the NameNode's {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in 
> line with {{SYS_POLICIES}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies

2017-04-25 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984101#comment-15984101
 ] 

Kai Zheng commented on HDFS-11605:
--

Thanks [~HuafengWang] for the update following our offline review discussion.

Did you also fix the checkstyle issues?

> Allow user to customize and add new erasure code codecs and policies
> 
>
> Key: HDFS-11605
> URL: https://issues.apache.org/jira/browse/HDFS-11605
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch, 
> HDFS-11605.003.patch, HDFS-11605.004.patch
>
>
> Based on the facility developed in HDFS-11604, this will add the necessary 
> CLI command to load an XML file; the resulting policies will be maintained 
> in the NameNode's {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in 
> line with {{SYS_POLICIES}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies

2017-04-25 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984088#comment-15984088
 ] 

Huafeng Wang commented on HDFS-11605:
-

My latest patch covers the following parts:
1. Adds a new RPC call that adds user-defined EC policies and returns the 
result of each add operation (see the usage sketch after this list).
2. Correspondingly, adds a new command in ECAdmin.
3. Changes ErasureCodingPolicyManager to a singleton-like class.
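
A hedged sketch of how a client might invoke the new RPC; the method and 
response-type names here are assumptions for illustration and may differ from 
the patch:

{code}
// Sketch only: add user-defined EC policies and report per-policy results.
// addErasureCodingPolicies / AddECPolicyResponse are assumed names.
void addUserPolicies(DistributedFileSystem dfs,
    ErasureCodingPolicy[] policies) throws IOException {
  AddECPolicyResponse[] responses = dfs.addErasureCodingPolicies(policies);
  for (AddECPolicyResponse r : responses) {
    System.out.println(r.getPolicy().getName()
        + (r.isSucceed() ? ": added" : ": failed, " + r.getErrorMsg()));
  }
}
{code}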

> Allow user to customize and add new erasure code codecs and policies
> 
>
> Key: HDFS-11605
> URL: https://issues.apache.org/jira/browse/HDFS-11605
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch, 
> HDFS-11605.003.patch, HDFS-11605.004.patch
>
>
> Based on the facility developed in HDFS-11604, this will add the necessary 
> CLI command to load an XML file; the resulting policies will be maintained 
> in the NameNode's {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in 
> line with {{SYS_POLICIES}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11605) Allow user to customize and add new erasure code codecs and policies

2017-04-25 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11605:

Attachment: HDFS-11605.004.patch

> Allow user to customize and add new erasure code codecs and policies
> 
>
> Key: HDFS-11605
> URL: https://issues.apache.org/jira/browse/HDFS-11605
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11605.001.patch, HDFS-11605.002.patch, 
> HDFS-11605.003.patch, HDFS-11605.004.patch
>
>
> Based on the facility developed in HDFS-11604, this will add the necessary 
> CLI command to load an XML file; the resulting policies will be maintained 
> in the NameNode's {{ErasureCodingPolicyManager}} as {{USER_POLICIES}}, in 
> line with {{SYS_POLICIES}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11580) Ozone: Support asynchronous client API for SCM and containers

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984042#comment-15984042
 ] 

Hadoop QA commented on HDFS-11580:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
53s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
31s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
0s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} hadoop-hdfs-project: The patch generated 7 new + 
0 unchanged - 1 fixed = 7 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
8s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new 
+ 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 19s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  Possible null pointer dereference of response in 
org.apache.hadoop.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtos$ContainerCommandResponseProto)
  Dereferenced at ContainerProtocolCalls.java:response in 
org.apache.hadoop.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtos$ContainerCommandResponseProto)
  Dereferenced at ContainerProtocolCalls.java:[line 650] |
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.server.namenode.TestStartup |
| Timed out junit tests | 

[jira] [Commented] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984021#comment-15984021
 ] 

Hadoop QA commented on HDFS-11627:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
58s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 47s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:612578f |
| JIRA Issue | HDFS-11627 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12864888/HDFS-11627-HDFS-7240.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fc112f0abbbf 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-7240 / eae8c2a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19196/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19196/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19196/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Block Storage: Cblock cache should register with flusher to upload blocks to 
> containers
> ---
>
> Key: HDFS-11627
> URL: https://issues.apache.org/jira/browse/HDFS-11627
> Project: 

[jira] [Commented] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984013#comment-15984013
 ] 

Hadoop QA commented on HDFS-11702:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
26s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 
extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
12s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HDFS-11702 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865017/HDFS-11702.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1155131d187c 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 475f933 |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19200/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
 |
| findbugs | 

[jira] [Updated] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike

2017-04-25 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-11384:
---
Attachment: HDFS-11384.009.patch

Added waitActive() after starting DataNodes.
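
For reference, a minimal sketch of that pattern in a MiniDFSCluster-based 
test (the surrounding test setup is assumed):

{code}
// Start additional DataNodes, then block until the NameNode sees them as
// live, so subsequent balancer getBlocks calls observe a stable cluster.
cluster.startDataNodes(conf, 2, true, null, null);
cluster.waitActive();
{code}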

> Add option for balancer to disperse getBlocks calls to avoid NameNode's 
> rpc.CallQueueLength spike
> -
>
> Key: HDFS-11384
> URL: https://issues.apache.org/jira/browse/HDFS-11384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.3
>Reporter: yunjiong zhao
>Assignee: Konstantin Shvachko
> Attachments: balancer.day.png, balancer.week.png, 
> HDFS-11384.001.patch, HDFS-11384.002.patch, HDFS-11384.003.patch, 
> HDFS-11384.004.patch, HDFS-11384.005.patch, HDFS-11384.006.patch, 
> HDFS-11384-007.patch, HDFS-11384.008.patch, HDFS-11384.009.patch
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes 
> causes the NameNode's rpc.CallQueueLength to spike. We observed that this 
> situation could cause HBase cluster failures due to RegionServer WAL 
> timeouts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984006#comment-15984006
 ] 

Hadoop QA commented on HDFS-11703:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
11s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
35s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-9806 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 11 new + 0 unchanged - 0 fixed = 11 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | HDFS-11703 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12865020/HDFS-11703-HDFS-9806.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 460af8747539 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-9806 / 76a72ae |
| Default Java | 1.8.0_121 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19198/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19198/artifact/patchprocess/patch-asflicense-problems.txt
 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 

[jira] [Commented] (HDFS-10631) Federation State Store ZooKeeper implementation

2017-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984000#comment-15984000
 ] 

Hadoop QA commented on HDFS-10631:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
27s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Redundant nullcheck of recordsToRemove, which is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.remove(Class,
 Query)  Redundant null check at StateStoreZooKeeperImpl.java:is known to be 
non-null in 
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.remove(Class,
 Query)  Redundant null check at StateStoreZooKeeperImpl.java:[line 315] |
|  |  Redundant nullcheck of znode, which is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.removeAll(Class)
  Redundant null check at StateStoreZooKeeperImpl.java:is known to be non-null 
in 
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.removeAll(Class)
  Redundant null check at StateStoreZooKeeperImpl.java:[line 344] |
|  |  Call to java.util.Map.equals(String) 
in 
org.apache.hadoop.hdfs.server.federation.store.records.BaseRecord.like(BaseRecord)
  At BaseRecord.java: At BaseRecord.java:[line 145] |
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  

[jira] [Updated] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers

2017-04-25 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11627:
-
Status: Patch Available  (was: Open)

> Block Storage: Cblock cache should register with flusher to upload blocks to 
> containers
> ---
>
> Key: HDFS-11627
> URL: https://issues.apache.org/jira/browse/HDFS-11627
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11627-HDFS-7240.001.patch, 
> HDFS-11627-HDFS-7240.002.patch
>
>
> Cblock cache should register with the flusher to upload blocks to containers.
> Currently the Container Cache flusher tries to write to the container even 
> when the CblockLocalCache pipelines are not registered with the flusher, 
> which causes the container writes to fail.
> CblockLocalCache should register with the flusher before accepting any 
> blocks for write.
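> A hypothetical sketch of the intended ordering (the class and method names 
> are assumptions, not from the patch):
> {code}
> // Register the cache's pipelines with the flusher before the cache starts
> // accepting writes, so every flushed block has a registered destination.
> flusher.register(cblockLocalCache.getPipelines());
> cblockLocalCache.start();  // only now begin accepting block writes
> {code}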



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers

2017-04-25 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11627:
-
Status: Open  (was: Patch Available)

> Block Storage: Cblock cache should register with flusher to upload blocks to 
> containers
> ---
>
> Key: HDFS-11627
> URL: https://issues.apache.org/jira/browse/HDFS-11627
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11627-HDFS-7240.001.patch, 
> HDFS-11627-HDFS-7240.002.patch
>
>
> Cblock cache should register with the flusher to upload blocks to containers.
> Currently the Container Cache flusher tries to write to the container even 
> when the CblockLocalCache pipelines are not registered with the flusher, 
> which causes the container writes to fail.
> CblockLocalCache should register with the flusher before accepting any 
> blocks for write.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI

2017-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983855#comment-15983855
 ] 

Hudson commented on HDFS-11691:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11628 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11628/])
HDFS-11691. Add a proper scheme to the datanode links in NN web UI. (jlowe: rev 
e4321ec84321672a714419278946fe1012daac71)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js


> Add a proper scheme to the datanode links in NN web UI
> --
>
> Key: HDFS-11691
> URL: https://issues.apache.org/jira/browse/HDFS-11691
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11691.patch
>
>
> On the datanodes page of the namenode web UI, the datanode links may not be 
> correct if the namenode is serving the page through http but https is also 
> enabled.  This is because {{dfshealth.js}} does not put a proper scheme in 
> front of the address.  It already determines whether the address is 
> non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what 
> it is currently setting.
> The existing mechanism would work for YARN and MAPRED, since they can only 
> serve one protocol, HTTP or HTTPS.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11703:
--
Status: Patch Available  (was: Open)

> [READ] Tests for ProvidedStorageMap
> ---
>
> Key: HDFS-11703
> URL: https://issues.apache.org/jira/browse/HDFS-11703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
> Attachments: HDFS-11703-HDFS-9806.001.patch
>
>
> Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11703:
--
Attachment: HDFS-11703-HDFS-9806.001.patch

> [READ] Tests for ProvidedStorageMap
> ---
>
> Key: HDFS-11703
> URL: https://issues.apache.org/jira/browse/HDFS-11703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
> Attachments: HDFS-11703-HDFS-9806.001.patch
>
>
> Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11703) [READ] Tests for ProvidedStorageMap

2017-04-25 Thread Virajith Jalaparti (JIRA)
Virajith Jalaparti created HDFS-11703:
-

 Summary: [READ] Tests for ProvidedStorageMap
 Key: HDFS-11703
 URL: https://issues.apache.org/jira/browse/HDFS-11703
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Virajith Jalaparti


Add tests for the {{ProvidedStorageMap}} in the namenode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI

2017-04-25 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HDFS-11691:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.1
   3.0.0-alpha3
   2.9.0
   Status: Resolved  (was: Patch Available)

Thanks to [~kihwal] for the contribution and to [~cheersyang] for additional 
review!  I committed this to trunk, branch-2, branch-2.8, and branch-2.8.1.

> Add a proper scheme to the datanode links in NN web UI
> --
>
> Key: HDFS-11691
> URL: https://issues.apache.org/jira/browse/HDFS-11691
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11691.patch
>
>
> On the datanodes page of the namenode web UI, the datanode links may not be 
> correct if the namenode is serving the page through http but https is also 
> enabled.  This is because {{dfshealth.js}} does not put a proper scheme in 
> front of the address.  It already determines whether the address is 
> non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what 
> it is currently setting.
> The existing mechanism would work for YARN and MAPRED, since they can only 
> serve one protocol, HTTP or HTTPS.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures

2017-04-25 Thread James Moore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Moore updated HDFS-11701:
---
Affects Version/s: (was: 2.7.0)
   2.6.0

> NPE from Unresolved Host causes permanent DFSInputStream failures
> -
>
> Key: HDFS-11701
> URL: https://issues.apache.org/jira/browse/HDFS-11701
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
> Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 
> 5.9.0
>Reporter: James Moore
>
> We recently encountered the following NPE due to the DFSInputStream storing 
> old cached block locations from hosts which could no longer resolve.
> {quote}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
> ~HBase related stack frames trimmed~
> {quote}
> After investigating, the DFSInputStream appears to have been open for upwards 
> of 3-4 weeks and had cached block locations from decommissioned nodes that no 
> longer resolve in DNS and had been shutdown and removed from the cluster 2 
> weeks prior.  If the DFSInputStream had refreshed its block locations from 
> the name node, it would have received alternative block locations which would 
> not contain the decommissioned data nodes.  As the above NPE leaves the 
> non-resolving data node in the list of block locations the DFSInputStream 
> never refreshes the block locations and all attempts to open a BlockReader 
> for the given blocks will fail.
> In our case, we resolved the NPE by closing and re-opening every 
> DFSInputStream in the cluster to force a purge of the block locations cache.  
> Ideally, the DFSInputStream would re-fetch all block locations for a host 
> which can't be resolved in DNS or at least the blocks requested.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10480) Add an admin command to list currently open files

2017-04-25 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983738#comment-15983738
 ] 

Rushabh S Shah commented on HDFS-10480:
---

bq. Just curious, what's the difference of this command with hdfs fsck / 
-openforwrite ?
The output will be the same. Just the time to reach that output is vastly 
different. :)
hdfs fsck / will crawl the whole filesystem.

> Add an admin command to list currently open files
> -
>
> Key: HDFS-10480
> URL: https://issues.apache.org/jira/browse/HDFS-10480
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Rushabh S Shah
> Attachments: HDFS-10480-trunk-1.patch, HDFS-10480-trunk.patch
>
>
> Currently there is no easy way to obtain the list of active leases or files 
> being written. It will be nice if we have an admin command to list open files 
> and their lease holders.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient

2017-04-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-11702:
--
Status: Patch Available  (was: Open)

> Remove indefinite caching of key provider uri in DFSClient
> --
>
> Key: HDFS-11702
> URL: https://issues.apache.org/jira/browse/HDFS-11702
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-11702.patch
>
>
> There is an indefinite caching of key provider uri in dfsclient.
> Relevant piece of code.
> {code:title=DFSClient.java|borderStyle=solid}
>   /**
>* The key provider uri is searched in the following order.
>* 1. If there is a mapping in Credential's secrets map for namenode uri.
>* 2. From namenode getServerDefaults rpc.
>* 3. Finally fallback to local conf.
>* @return keyProviderUri if found from either of above 3 cases,
>* null otherwise
>* @throws IOException
>*/
>   URI getKeyProviderUri() throws IOException {
> if (keyProviderUri != null) {
>   return keyProviderUri;
> }
> // Lookup the secret in credentials object for namenodeuri.
> Credentials credentials = ugi.getCredentials();
>...
>...
> {code}
> Once the key provider uri is set, it won't refresh the value even if the key 
> provider uri on namenode is changed.
> For long running clients like on oozie servers, this means we have to bounce 
> all the oozie servers to get the change reflected.
> After this change, the client will cache the value for an hour after which it 
> will issue getServerDefaults call and will refresh the key provider uri.
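
A minimal sketch of the proposed time-bounded caching, assuming a one-hour TTL; the class and fetcher names are illustrative, not the actual DFSClient fields:

{code:java}
import java.net.URI;
import java.util.concurrent.TimeUnit;

// Illustrative TTL cache: returns the cached key provider URI until it is
// one hour old, then re-fetches it (e.g. via the getServerDefaults RPC).
class KeyProviderUriCache {
  private static final long TTL_NANOS = TimeUnit.HOURS.toNanos(1);
  private URI cachedUri;
  private long cachedAtNanos;

  interface UriFetcher {
    URI fetch() throws Exception;
  }

  synchronized URI get(UriFetcher fetcher) throws Exception {
    long now = System.nanoTime();
    if (cachedUri == null || now - cachedAtNanos > TTL_NANOS) {
      cachedUri = fetcher.fetch();
      cachedAtNanos = now;
    }
    return cachedUri;
  }
}
{code}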



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient

2017-04-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-11702:
--
Attachment: HDFS-11702.patch

Attaching a simple patch.

> Remove indefinite caching of key provider uri in DFSClient
> --
>
> Key: HDFS-11702
> URL: https://issues.apache.org/jira/browse/HDFS-11702
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-11702.patch
>
>
> There is an indefinite caching of key provider uri in dfsclient.
> Relevant piece of code.
> {code:title=DFSClient.java|borderStyle=solid}
>   /**
>* The key provider uri is searched in the following order.
>* 1. If there is a mapping in Credential's secrets map for namenode uri.
>* 2. From namenode getServerDefaults rpc.
>* 3. Finally fallback to local conf.
>* @return keyProviderUri if found from either of above 3 cases,
>* null otherwise
>* @throws IOException
>*/
>   URI getKeyProviderUri() throws IOException {
> if (keyProviderUri != null) {
>   return keyProviderUri;
> }
> // Lookup the secret in credentials object for namenodeuri.
> Credentials credentials = ugi.getCredentials();
>...
>...
> {code}
> Once the key provider uri is set, it won't refresh the value even if the key 
> provider uri on namenode is changed.
> For long running clients like on oozie servers, this means we have to bounce 
> all the oozie servers to get the change reflected.
> After this change, the client will cache the value for an hour after which it 
> will issue getServerDefaults call and will refresh the key provider uri.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient

2017-04-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-11702:
--
Description: 
There is an indefinite caching of key provider uri in dfsclient.
Relevant piece of code.
{code:title=DFSClient.java|borderStyle=solid}
  /**
   * The key provider uri is searched in the following order.
   * 1. If there is a mapping in Credential's secrets map for namenode uri.
   * 2. From namenode getServerDefaults rpc.
   * 3. Finally fallback to local conf.
   * @return keyProviderUri if found from either of above 3 cases,
   * null otherwise
   * @throws IOException
   */
  URI getKeyProviderUri() throws IOException {
if (keyProviderUri != null) {
  return keyProviderUri;
}
// Lookup the secret in credentials object for namenodeuri.
Credentials credentials = ugi.getCredentials();
   ...
   ...
{code}

Once the key provider uri is set, it won't refresh the value even if the key 
provider uri on namenode is changed.
For long running clients like on oozie servers, this means we have to bounce 
all the oozie servers to get the change reflected.
After this change, the client will cache the value for an hour after which it 
will issue getServerDefaults call and will refresh the key provider uri.

  was:
There is an indefinite caching of key provider uri in dfsclient.
Relevant piece of code.
{code:title=DFSClient.java|borderStyle=solid}
  /**
   * The key provider uri is searched in the following order.
   * 1. If there is a mapping in Credential's secrets map for namenode uri.
   * 2. From namenode getServerDefaults rpc.
   * 3. Finally fallback to local conf.
   * @return keyProviderUri if found from either of above 3 cases,
   * null otherwise
   * @throws IOException
   */
  URI getKeyProviderUri() throws IOException {
if (keyProviderUri != null) {
  return keyProviderUri;
}
{code}

Once the key provider uri is set, it won't refresh the value even if the key 
provider uri on namenode is changed.
For long running clients like on oozie servers, this means we have to bounce 
all the oozie servers to get the change reflected.
After this change, the client will cache the value for an hour after which it 
will issue getServerDefaults call and will refresh the key provider uri.


> Remove indefinite caching of key provider uri in DFSClient
> --
>
> Key: HDFS-11702
> URL: https://issues.apache.org/jira/browse/HDFS-11702
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>
> There is an indefinite caching of key provider uri in dfsclient.
> Relevant piece of code.
> {code:title=DFSClient.java|borderStyle=solid}
>   /**
>* The key provider uri is searched in the following order.
>* 1. If there is a mapping in Credential's secrets map for namenode uri.
>* 2. From namenode getServerDefaults rpc.
>* 3. Finally fallback to local conf.
>* @return keyProviderUri if found from either of above 3 cases,
>* null otherwise
>* @throws IOException
>*/
>   URI getKeyProviderUri() throws IOException {
> if (keyProviderUri != null) {
>   return keyProviderUri;
> }
> // Lookup the secret in credentials object for namenodeuri.
> Credentials credentials = ugi.getCredentials();
>...
>...
> {code}
> Once the key provider uri is set, it won't refresh the value even if the key 
> provider uri on namenode is changed.
> For long running clients like on oozie servers, this means we have to bounce 
> all the oozie servers to get the change reflected.
> After this change, the client will cache the value for an hour after which it 
> will issue getServerDefaults call and will refresh the key provider uri.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11702) Remove indefinite caching of key provider uri in DFSClient

2017-04-25 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created HDFS-11702:
-

 Summary: Remove indefinite caching of key provider uri in DFSClient
 Key: HDFS-11702
 URL: https://issues.apache.org/jira/browse/HDFS-11702
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah


There is an indefinite caching of key provider uri in dfsclient.
Relevant piece of code.
{code:title=DFSClient.java|borderStyle=solid}
  /**
   * The key provider uri is searched in the following order.
   * 1. If there is a mapping in Credential's secrets map for namenode uri.
   * 2. From namenode getServerDefaults rpc.
   * 3. Finally fallback to local conf.
   * @return keyProviderUri if found from either of above 3 cases,
   * null otherwise
   * @throws IOException
   */
  URI getKeyProviderUri() throws IOException {
if (keyProviderUri != null) {
  return keyProviderUri;
}
{code}

Once the key provider uri is set, it won't refresh the value even if the key 
provider uri on namenode is changed.
For long running clients like on oozie servers, this means we have to bounce 
all the oozie servers to get the change reflected.
After this change, the client will cache the value for an hour after which it 
will issue getServerDefaults call and will refresh the key provider uri.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2017-04-25 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983518#comment-15983518
 ] 

Wei-Chiu Chuang commented on HDFS-10799:


I got pinged about this patch a few times recently, so I gave it some deeper 
thought.

I am convinced now that it's not entirely NameNode's responsibility. Even after 
applying this patch, the iNotify client still needs to ensure its Kerberos 
ticket is valid, otherwise the next iNotify request will fail. One of our 
internal projects bumped into this issue, and after adding code to renew 
tickets periodically, this issue went away.

The only correct thing for the NN to do is to distinguish an iNotify request 
from an edit log flush request. For the former, it shouldn't print that scary 
message.
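
For reference, a minimal sketch of the client-side renewal mentioned above, assuming the client logged in from a keytab; the one-hour schedule is an assumption, not a recommendation from this thread:

{code:java}
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.security.UserGroupInformation;

public class TgtRenewer {
  // Periodically re-login from the keytab so the iNotify client's Kerberos
  // ticket does not expire. checkTGTAndReloginFromKeytab() is a no-op unless
  // the TGT is close to expiring.
  public static void startRenewal() {
    ScheduledExecutorService ses =
        Executors.newSingleThreadScheduledExecutor();
    ses.scheduleWithFixedDelay(() -> {
      try {
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
      } catch (IOException e) {
        // Log and retry on the next tick.
        e.printStackTrace();
      }
    }, 1, 1, TimeUnit.HOURS);
  }
}
{code}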

> NameNode should use loginUser(hdfs) to serve iNotify requests
> -
>
> Key: HDFS-10799
> URL: https://issues.apache.org/jira/browse/HDFS-10799
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
> Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10799.001.patch
>
>
> When a NameNode serves iNotify requests from a client, it verifies the client 
> has superuser permission and then uses the client's Kerberos principal to 
> read edits from journal nodes.
> However, if the client does not renew its tgt tickets, the connection from 
> NameNode to journal nodes may fail. In which case, the NameNode thinks the 
> edits are corrupt, and prints a scary error message:
> "During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!"
> However, the edits are actually good. NameNode _should not freak out when an 
> iNotify client's tgt ticket expires_.
> I think that an easy solution to this bug, is that after NameNode verifies 
> client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
> read edits. This will make sure the operation does not fail due to an expired 
> client ticket.
> Excerpt of related logs:
> {noformat}
> 2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
> cause:java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> 2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 112 on 8020, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
> IP:port] Call#73 Retry#0
> java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at 

[jira] [Commented] (HDFS-11627) Block Storage: Cblock cache should register with flusher to upload blocks to containers

2017-04-25 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983514#comment-15983514
 ] 

Chen Liang commented on HDFS-11627:
---

Thanks [~msingh] for the updates and the comments! v002 patch looks good to me, 
pending Jenkins.

> Block Storage: Cblock cache should register with flusher to upload blocks to 
> containers
> ---
>
> Key: HDFS-11627
> URL: https://issues.apache.org/jira/browse/HDFS-11627
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11627-HDFS-7240.001.patch, 
> HDFS-11627-HDFS-7240.002.patch
>
>
> Cblock cache should register with flusher to upload blocks to containers.
> Currently the Container Cache flusher tries to write to the container even when 
> the CblockLocalCache pipelines are not registered with the flusher. 
> This will cause the container writes to fail.
> CblockLocalCache should register with the flusher before accepting any blocks 
> for write.
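
A minimal sketch of that ordering constraint, with hypothetical names (these are not the actual CBlock classes' APIs):

{code:java}
// Hypothetical sketch: the cache registers with the flusher before it
// accepts any block for write, so the flusher always knows the pipeline.
class LocalCacheSketch {
  private volatile boolean registered = false;

  void start(FlusherSketch flusher) {
    flusher.register(this);  // must happen before any put()
    registered = true;
  }

  void put(byte[] block) {
    if (!registered) {
      throw new IllegalStateException("cache not registered with flusher");
    }
    // ... queue the block for an asynchronous flush to a container ...
  }

  interface FlusherSketch {
    void register(LocalCacheSketch cache);
  }
}
{code}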



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11580) Ozone: Support asynchronus client API for SCM and containers

2017-04-25 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983505#comment-15983505
 ] 

Chen Liang commented on HDFS-11580:
---

Thanks [~linyiqun] for updating the patch!

in {{XceiverClientHandler.java}}

1. I'm not sure how {{CompletableFuture}} works here, so this is more of a 
question, and everyone's thoughts are welcome: in {{sendCommandAsync}}, will 
{{supplyAsync}} be called by multiple threads at the same time? How about 
{{waitForResponse}}? If so, is this method thread-safe? (e.g. do we need 
protection for {{pendingResponses}}?)

in {{ContainerProtocolCalls.java}}
2. Actually I think [~msingh] brought up a critical point, which is that we 
need to be certain that the responses match the requests. This can be somewhat 
tricky, but can we add a unit test to verify this? I'm thinking of maybe 
having a test that sends a number of async requests and checks that they all 
get properly responded to.

3. Make this one line?
{code}
ContainerCommandResponseProto response;
response = xceiverClient.sendCommand(request);
{code}

in {{XceiverClientRatis.java}}
4. {{// TODO: Implement the async interface.}}
Let's either file another JIRA to follow up, or throw an 
{{UnsupportedOperationException}} if we will not support async calls with 
Ratis. [~anu]?
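
One common way to make the matching thread-safe, sketched here with placeholder types (this is not the patch itself): keep a {{ConcurrentHashMap}} from a per-request id to a {{CompletableFuture}} that the response-handler thread completes.

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// R is a placeholder for the protobuf response class.
class ResponseMatcher<R> {
  private final ConcurrentMap<String, CompletableFuture<R>> pending =
      new ConcurrentHashMap<>();

  // Caller thread: register interest before sending the request.
  CompletableFuture<R> register(String requestId) {
    CompletableFuture<R> future = new CompletableFuture<>();
    pending.put(requestId, future);
    return future;
  }

  // I/O thread: complete the matching future when its response arrives.
  void onResponse(String requestId, R response) {
    CompletableFuture<R> future = pending.remove(requestId);
    if (future != null) {
      future.complete(response);
    }
  }
}
{code}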

> Ozone: Support asynchronus client API for SCM and containers
> 
>
> Key: HDFS-11580
> URL: https://issues.apache.org/jira/browse/HDFS-11580
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Attachments: HDFS-11580-HDFS-7240.001.patch, 
> HDFS-11580-HDFS-7240.002.patch, HDFS-11580-HDFS-7240.003.patch, 
> HDFS-11580-HDFS-7240.004.patch
>
>
> This is an umbrella JIRA that needs to support a set of APIs in Asynchronous 
> form.
> For containers -- or the datanode API currently supports a call 
> {{sendCommand}}. we need to build proper programming interface and support an 
> async interface.
> There is also a set of SCM API that clients can call, it would be nice to 
> support Async interface for those too.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10631) Federation State Store ZooKeeper implementation

2017-04-25 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated HDFS-10631:
---
Attachment: HDFS-10631-HDFS-10467-003.patch

* Updating to {{Logger}}
* Switching to {{Query}}

> Federation State Store ZooKeeper implementation
> ---
>
> Key: HDFS-10631
> URL: https://issues.apache.org/jira/browse/HDFS-10631
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Inigo Goiri
>Assignee: Jason Kace
> Attachments: HDFS-10631-HDFS-10467-001.patch, 
> HDFS-10631-HDFS-10467-002.patch, HDFS-10631-HDFS-10467-003.patch
>
>
> State Store implementation using ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10467) Router-based HDFS federation

2017-04-25 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983422#comment-15983422
 ] 

Inigo Goiri commented on HDFS-10467:


[~fabbri], the tasks in the current JIRA are the basic ones to get the 
Router-based federation working.
There are a bunch of them that we can add:
* Web interface
* Metrics system
* Router heartbeating
* Router safe mode
* Rebalancing

All of these are already implemented and running in our clusters.
A version from a couple of months ago is available at:
https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router
(I can update it with the latest if needed.)

At this point it's a matter of reviewing the code in the subtasks.
It's hard to give a time frame without reviews, so any reviews on the 
subtasks are highly appreciated.

> Router-based HDFS federation
> 
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.1
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS 
> Router Federation.pdf, HDFS-Router-Federation-Prototype.patch
>
>
> Add a Router to provide a federated view of multiple HDFS clusters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10467) Router-based HDFS federation

2017-04-25 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983390#comment-15983390
 ] 

Aaron Fabbri commented on HDFS-10467:
-

Thanks for the update [~elgoiri].  I'm trying to get a feel for the overall 
progress of this.  Are there any work items that are not already covered in the 
subtasks here?  Any other details on how much work is left, or when you expect 
to have the basic features completed, are welcome.

> Router-based HDFS federation
> 
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.1
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: HDFS-10467.PoC.001.patch, HDFS-10467.PoC.patch, HDFS 
> Router Federation.pdf, HDFS-Router-Federation-Prototype.patch
>
>
> Add a Router to provide a federated view of multiple HDFS clusters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-7541) Upgrade Domains in HDFS

2017-04-25 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned HDFS-7541:


Assignee: Kihwal Lee  (was: Ming Ma)

> Upgrade Domains in HDFS
> ---
>
> Key: HDFS-7541
> URL: https://issues.apache.org/jira/browse/HDFS-7541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Kihwal Lee
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-7541-2.patch, HDFS-7541.patch, 
> SupportforfastHDFSdatanoderollingupgrade.pdf, UpgradeDomains_design_v2.pdf, 
> UpgradeDomains_Design_v3.pdf
>
>
> Current HDFS DN rolling upgrade step requires sequential DN restart to 
> minimize the impact on data availability and read/write operations. The side 
> effect is longer upgrade duration for large clusters. This might be 
> acceptable for DN JVM quick restart to update hadoop code/configuration. 
> However, for OS upgrade that requires machine reboot, the overall upgrade 
> duration will be too long if we continue to do sequential DN rolling restart.
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11700) testBackupNodePorts doesn't pass on Windows machine

2017-04-25 Thread Anbang Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983266#comment-15983266
 ] 

Anbang Hu commented on HDFS-11700:
--

More detailed investigation shows that after *Attempt 1* fails, when forcing 
{{rpcServer}} to stop, {{listener.doStop()}} in {{Server.stop()}} fails to 
close the socket properly. Tracing the code inside {{doStop}}:
{{acceptChannel.socket()}} returns a {{ServerSocketAdaptor}}
-> {{ServerSocketAdaptor.close()}}
-> {{ServerSocketChannelImpl.close()}}
-> {{AbstractInterruptibleChannel.close()}}
-> {{AbstractSelectableChannel.implCloseChannel()}}
-> {{ServerSocketChannelImpl.implCloseSelectableChannel()}}
-> {{SocketDispatcher.preClose()}}
-> {{preClose0(FileDescriptor var0)}}

The difference between Windows and Ubuntu is that after {{preClose0(FileDescriptor 
var0)}}, the previously used port becomes available again on Ubuntu, but not on 
Windows. {{preClose0}} is a native method, and I am using Oracle Java 1.8.0_121.

> testBackupNodePorts doesn't pass on Windows machine
> ---
>
> Key: HDFS-11700
> URL: https://issues.apache.org/jira/browse/HDFS-11700
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Windows 10
>Reporter: Anbang Hu
>
> In TestHDFSServerPorts.testBackupNodePorts, there are two attempts at 
> starting backup node.
> *Attempt 1*:
> 1) It binds namenode backup address with 0:
> {quote}
> backup_config.set(DFSConfigKeys.DFS_NAMENODE_BACKUP_ADDRESS_KEY, THIS_HOST);
> {quote}
> 2) It sets namenode backup address with an available port X.
> 3) It fails rightfully due to using the same http address as active namenode.
> *Attempt 2*:
> 1) It tries to reuse port X as namenode backup address.
> 2) It fails to bind to X because Windows does not release port X properly 
> after *Attempt 1* fails.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10788) fsck NullPointerException when it encounters corrupt replicas

2017-04-25 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983264#comment-15983264
 ] 

Wei-Chiu Chuang commented on HDFS-10788:


We are receiving multiple reports that users on CDH versions above CDH5.5.2 are 
still experiencing the same issue. It is not clear at this point if Apache 
Hadoop also carries the same bug, but I thought I should share this information.

> fsck NullPointerException when it encounters corrupt replicas
> -
>
> Key: HDFS-10788
> URL: https://issues.apache.org/jira/browse/HDFS-10788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
> Environment: CDH5.5.2, CentOS 6.7
>Reporter: Jeff Field
>
> Somehow (I haven't found the root cause yet) we ended up with blocks that have 
> corrupt replicas where the replica count is inconsistent between the blockmap 
> and the corrupt replicas map. If we try to hdfs fsck any parent directory 
> that has a child with one of these blocks, fsck will exit with something like 
> this:
> {code}
> $ hdfs fsck /path/to/parent/dir/ | egrep -v '^\.+$'
> Connecting to namenode via http://mynamenode:50070
> FSCK started by bot-hadoop (auth:KERBEROS_SSL) from /10.97.132.43 for path 
> /path/to/parent/dir/ at Tue Aug 23 20:34:58 UTC 2016
> .FSCK 
> ended at Tue Aug 23 20:34:59 UTC 2016 in 1098 milliseconds
> null
> Fsck on path '/path/to/parent/dir/' FAILED
> {code}
> So I start at the top, fscking every subdirectory until I find one or more 
> that fails. Then I do the same thing with those directories (our top level 
> directories all have subdirectories with date directories in them, which then 
> contain the files) and once I find a directory with files in it, I run a 
> checksum of the files in that directory. When I do that, I don't get the name 
> of the file, rather I get:
> checksum: java.lang.NullPointerException
> but since the files are in order, I can figure it out by seeing which file 
> was before the NPE. Once I get to this point, I can see the following in the 
> namenode log when I try to checksum the corrupt file:
> 2016-08-23 20:24:59,627 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Inconsistent 
> number of corrupt replicas for blk_1335893388_1100036319546 blockMap has 0 
> but corrupt replicas map has 1
> 2016-08-23 20:24:59,627 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 23 on 8020, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 
> 192.168.1.100:47785 Call#1 Retry#0
> java.lang.NullPointerException
> At which point I can delete the file, but it is a very tedious process.
> Ideally, shouldn't fsck be able to emit the name of the file that is the 
> source of the problem - and (if -delete is specified) get rid of the file, 
> instead of exiting without saying why?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures

2017-04-25 Thread James Moore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Moore updated HDFS-11701:
---
Description: 
We recently encountered the following NPE due to the DFSInputStream storing old 
cached block locations from hosts which could no longer resolve.

{quote}
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
at 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at 
org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
at 
org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
~HBase related stack frames trimmed~
{quote}

After investigating, the DFSInputStream appears to have been open for upwards 
of 3-4 weeks and had cached block locations from decommissioned nodes that no 
longer resolve in DNS and had been shutdown and removed from the cluster 2 
weeks prior.  If the DFSInputStream had refreshed its block locations from the 
name node, it would have received alternative block locations which would not 
contain the decommissioned data nodes.  As the above NPE leaves the 
non-resolving data node in the list of block locations the DFSInputStream never 
refreshes the block locations and all attempts to open a BlockReader for the 
given blocks will fail.

In our case, we resolved the NPE by closing and re-opening every DFSInputStream 
in the cluster to force a purge of the block locations cache.  Ideally, the 
DFSInputStream would re-fetch all block locations for a host which can't be 
resolved in DNS or at least the blocks requested.




  was:
We recently encountered the following NPE due to the DFSInputStream storing old 
cached block locations from hosts which could no longer resolve.

```Caused by: java.lang.NullPointerException
at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
at 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at 
org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
at 
org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
~HBase related stack frames trimmed~```

After investigating, the DFSInputStream appears to have been open for upwards 
of 3-4 weeks and had cached block locations from decommissioned nodes that no 
longer resolve in DNS and had been shutdown and removed from the cluster 2 
weeks prior.  If the DFSInputStream had refreshed its block locations from the 
name node, it would have received alternative block locations which would not 
contain the decommissioned data nodes.  As the above NPE leaves the 
non-resolving data node in the list of block locations the DFSInputStream never 
refreshes the block locations and all attempts to open a BlockReader for the 
given blocks will fail.

In our case, we resolved the NPE by closing and re-opening every DFSInputStream 
in the cluster to force a purge of the block locations cache.  Ideally, the 
DFSInputStream would re-fetch all block locations for a host which can't be 
resolved in DNS or at least the blocks requested.





> NPE from Unresolved Host causes permanent DFSInputStream failures
> -
>
> Key: HDFS-11701
> URL: https://issues.apache.org/jira/browse/HDFS-11701
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.0
> Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 
> 5.9.0
>Reporter: James Moore
>
> We recently encountered the following NPE due to the DFSInputStream storing 
> old cached block locations from hosts which could no longer resolve.
> {quote}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
> at 
> 

[jira] [Created] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures

2017-04-25 Thread James Moore (JIRA)
James Moore created HDFS-11701:
--

 Summary: NPE from Unresolved Host causes permanent DFSInputStream 
failures
 Key: HDFS-11701
 URL: https://issues.apache.org/jira/browse/HDFS-11701
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.7.0
 Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 
5.9.0
Reporter: James Moore


We recently encountered the following NPE due to the DFSInputStream storing old 
cached block locations from hosts which could no longer resolve.

```Caused by: java.lang.NullPointerException
at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
at 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at 
org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
at 
org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
~HBase related stack frames trimmed~```

After investigating, the DFSInputStream appears to have been open for upwards 
of 3-4 weeks and had cached block locations from decommissioned nodes that no 
longer resolve in DNS and had been shutdown and removed from the cluster 2 
weeks prior.  If the DFSInputStream had refreshed its block locations from the 
name node, it would have received alternative block locations which would not 
contain the decommissioned data nodes.  As the above NPE leaves the 
non-resolving data node in the list of block locations the DFSInputStream never 
refreshes the block locations and all attempts to open a BlockReader for the 
given blocks will fail.

In our case, we resolved the NPE by closing and re-opening every DFSInputStream 
in the cluster to force a purge of the block locations cache.  Ideally, the 
DFSInputStream would re-fetch all block locations for a host which can't be 
resolved in DNS or at least the blocks requested.






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11697) Javadoc of erasure coding policy in file status

2017-04-25 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983063#comment-15983063
 ] 

Takanobu Asanuma commented on HDFS-11697:
-

+1(non-binding), pending Jenkins. Thanks for the patch!

> Javadoc of erasure coding policy in file status
> ---
>
> Key: HDFS-11697
> URL: https://issues.apache.org/jira/browse/HDFS-11697
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
> Attachments: HDFS-11697.01.patch, HDFS-11697.02.patch
>
>
> Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in 
> javadoc explicitly as well as storage policy.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11691) Add a proper scheme to the datanode links in NN web UI

2017-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983034#comment-15983034
 ] 

Jason Lowe commented on HDFS-11691:
---

+1 lgtm.  I'll commit this later today if there are no objections.

> Add a proper scheme to the datanode links in NN web UI
> --
>
> Key: HDFS-11691
> URL: https://issues.apache.org/jira/browse/HDFS-11691
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-11691.patch
>
>
> On the datanodes page of the namenode web UI, the datanode links may not be 
> correct if the namenode is serving the page through http but https is also 
> enabled.  This is because {{dfshealth.js}} does not put a proper scheme in 
> front of the address.  It already determines whether the address is 
> non-secure or secure. It can simply prepend {{http:}} or {{https:}} to what 
> it is currently setting.
> The existing mechanism would work for YARN and MAPRED, since they can only 
> serve one protocol, HTTP or HTTPS.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11697) Javadoc of erasure coding policy in file status

2017-04-25 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-11697:
--
Attachment: HDFS-11697.02.patch

[~tasanuma0829] Thanks for checking! I updated the patch accordingly.

> Javadoc of erasure coding policy in file status
> ---
>
> Key: HDFS-11697
> URL: https://issues.apache.org/jira/browse/HDFS-11697
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
> Attachments: HDFS-11697.01.patch, HDFS-11697.02.patch
>
>
> Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in 
> javadoc explicitly as well as storage policy.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9803) Proactively refresh ShortCircuitCache entries to avoid latency spikes

2017-04-25 Thread Parag Darji (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982987#comment-15982987
 ] 

Parag Darji commented on HDFS-9803:
---

I'm facing the same issue and seeing slowness in HBase performance. Has anyone 
experienced slowness in HBase?
For now I'm restarting HDFS every three weeks which seems to help a bit.

> Proactively refresh ShortCircuitCache entries to avoid latency spikes
> -
>
> Key: HDFS-9803
> URL: https://issues.apache.org/jira/browse/HDFS-9803
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Nick Dimiduk
>
> My region server logs are flooding with messages like 
> "SecretManager$InvalidToken: access control error while attempting to set up 
> short-circuit access to  ... is expired". These logs 
> correspond with responseTooSlow WARNings from the region server.
> {noformat}
> 2016-01-19 22:10:14,432 INFO  
> [B.defaultRpcServer.handler=4,queue=1,port=16020] 
> shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load 
> 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception.
> org.apache.hadoop.security.token.SecretManager$InvalidToken: access control 
> error while attempting to set up short-circuit access to  token 
> with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, 
> userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, 
> blockId=1074037633, access modes=[READ]) is expired.
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
>   at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
> ...
> {noformat}
> A potential solution could be to have a background thread that makes a best 
> effort to proactively refresh tokens in the cache before they expire, so as 
> to minimize latency impact on the critical path.
> Thanks to [~cnauroth] for providing an explaination and suggesting a solution 
> over on the [user 
> list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E].
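
A minimal sketch of such a background refresher, with hypothetical entry and refresh APIs (the real ShortCircuitCache internals differ):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class ProactiveRefresher {
  // Hypothetical view of a cache entry; not the real ShortCircuitCache API.
  interface CacheEntry {
    long expiryMillis();
    void refresh();
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // Schedule a best-effort refresh leadTimeMillis before the entry expires.
  void watch(CacheEntry entry, long leadTimeMillis) {
    long delay =
        entry.expiryMillis() - System.currentTimeMillis() - leadTimeMillis;
    scheduler.schedule(entry::refresh, Math.max(delay, 0),
        TimeUnit.MILLISECONDS);
  }
}
{code}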



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11697) Javadoc of erasure coding policy in file status

2017-04-25 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982864#comment-15982864
 ] 

Takanobu Asanuma edited comment on HDFS-11697 at 4/25/17 1:18 PM:
--

Thanks for the patch, [~lewuathe].

* The javadoc of {{HdfsFileStatus}} constructor also misses the other 
arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them?

* This will raise a checkstyle warning, "First sentence should end with a 
period."
{code}
/**
 * Get the erasure coding policy if it's set
 * @return the erasure coding policy
 */
{code}
There are six same warnings in other places in this file. Let's fix them.



was (Author: tasanuma0829):
Thanks for the patch, [~lewuathe].

* The javadoc of {{HdfsFileStatus}} constructor also misses the other 
arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them?

* This will raise a checkstyle warning, "First sentence should end with a 
period."
{code}
/**
 * Get the erasure coding policy if it's set
 * @return the erasure coding policy
 */
{code}
There are five same warnings in other places in this file. Let's fix them.


> Javadoc of erasure coding policy in file status
> ---
>
> Key: HDFS-11697
> URL: https://issues.apache.org/jira/browse/HDFS-11697
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
> Attachments: HDFS-11697.01.patch
>
>
> Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in 
> javadoc explicitly as well as storage policy.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11697) Javadoc of erasure coding policy in file status

2017-04-25 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982864#comment-15982864
 ] 

Takanobu Asanuma commented on HDFS-11697:
-

Thanks for the patch, [~lewuathe].

* The javadoc of {{HdfsFileStatus}} constructor also misses the other 
arguments, {{symlink}} and {{childrenNum}}. How about adding javadoc for them?

* This will raise a checkstyle warning, "First sentence should end with a 
period."
{code}
/**
 * Get the erasure coding policy if it's set
 * @return the erasure coding policy
 */
{code}
There are five same warnings in other places in this file. Let's fix them.


> Javadoc of erasure coding policy in file status
> ---
>
> Key: HDFS-11697
> URL: https://issues.apache.org/jira/browse/HDFS-11697
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
> Attachments: HDFS-11697.01.patch
>
>
> Though {{HdfsFileStatus}} keeps erasure coding policy, it's not shown in 
> javadoc explicitly as well as storage policy.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10999) Introduce separate stats for Replicated and Erasure Coded Blocks apart from the current Aggregated stats

2017-04-25 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982805#comment-15982805
 ] 

Takanobu Asanuma commented on HDFS-10999:
-

Hi, [~manojg].

Thanks for uploading the patch. I have checked almost all of the changed code. 
The interface changes accompanying the additions of the new MBeans and Stats 
look good to me (non-binding). Some comments on the detailed implementations:

*BlockManagerSafeMode.java:*

* How about using {{LongAccumulator}} for {{numberOfBytesInFutureBlocks}}, too? 
(A minimal usage sketch follows after this list.)

*CorruptReplicasMap.java:*

* Should this {{decrementBlockStat}} be included in the if statement?

{code:java}
if (datanodes.isEmpty()) {
  // remove the block if there is no more corrupted replicas
  corruptReplicasMap.remove(blk);
  decrementBlockStat(blk);
}
{code}

* It seems package-private is enough for the new methods 
{{getCorruptReplicatedBlocksStat}} and {{getCorruptStripedBlocksStat}}.

*InvalidateBlocks.java and LowRedundancyBlocks.java:*

Sorry, but I still need more time to review this code.

*For unit tests:*

I think it would be good if we added more unit tests for these changes in this 
JIRA or follow-on JIRAs.

* Add more validations for new metrics in {{TestComputeInvalidateWork}}, 
{{TestCorruptReplicaInfo}} and {{TestLowRedundancyBlockQueues}}.

* {{TestUnderReplicatedBlocks}} covers only replicated files. If we use 
{{DFSTestUtil#verifyClientStats}} in {{TestReconstructStripedBlocks}}, we may 
be able to cover the EC case.
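
A minimal usage sketch of {{LongAccumulator}} for a concurrently updated counter, as referenced in the first bullet above; the class and field names are illustrative only:

{code:java}
import java.util.concurrent.atomic.LongAccumulator;

class BytesInFutureSketch {
  // Lock-free accumulator; safe for concurrent updates from many threads.
  private final LongAccumulator bytesInFuture =
      new LongAccumulator(Long::sum, 0L);

  void add(long blockSize) {
    bytesInFuture.accumulate(blockSize);
  }

  long total() {
    return bytesInFuture.get();
  }
}
{code}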


> Introduce separate stats for Replicated and Erasure Coded Blocks apart from 
> the current Aggregated stats
> 
>
> Key: HDFS-10999
> URL: https://issues.apache.org/jira/browse/HDFS-10999
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>  Labels: hdfs-ec-3.0-nice-to-have, supportability
> Attachments: HDFS-10999.01.patch, HDFS-10999.02.patch
>
>
> Per HDFS-9857, it seems in the Hadoop 3 world, people prefer the more generic 
> term "low redundancy" to the old-fashioned "under replicated". But this term 
> is still being used in messages in several places, such as web ui, dfsadmin 
> and fsck. We should probably change them to avoid confusion.
> File this jira to discuss it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11580) Ozone: Support asynchronus client API for SCM and containers

2017-04-25 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11580:
-
Attachment: HDFS-11580-HDFS-7240.004.patch

Thanks [~anu], [~vagarychen] and [~msingh] for the great comments! All the 
comments make sense to me. 
The following are my comments:
1.
{quote}
 Do you think we should have 2 functions, like readChunk and readChunkAsync so 
that it looks more like java 8-ish ? rather than a boolean flag ?
{quote}
I have addressed this in the latest patch.

2. On the problem that the current async calls still behave like synchronous 
calls: yes, I think this is a problem here. As [~vagarychen] mentioned, 
we should not invoke {{get()}} in the same thread. Maybe we can register a 
callback instead. I took a look into {{CompletableFuture}}; there are 
already many APIs we can use for this case. In my latest patch, I used the API 
{{CompletableFuture#thenApply}} to process the future result 
asynchronously and return a new CompletableFuture object. This should be the 
right way to return the CompletableFuture object to the client and let the 
client call future.get().

3.
{quote}
 Also, because of the async nature of the interface, responses need not be in 
the same order as the requests. We will need a method to match the responses to 
the requests.
{quote}
This is a good catch. With an async interface, we need to be more careful to 
match each request with its corresponding response. In my latest patch, I 
defined a new map to store the pending responses; more details can be seen in 
the method {{XceiverClientHandler#waitForResponse}}.
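
As an aside, one common way to match out-of-order responses is to key a 
per-request future by an ID and complete it from the I/O thread. This is only 
an illustrative sketch, not the code in 
{{XceiverClientHandler#waitForResponse}}:

{code:java}
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: responses may arrive in any order; each request registers a
// future keyed by its ID, and the I/O thread completes the matching future.
public class ResponseMatcher {
  private final Map<String, CompletableFuture<String>> pending =
      new ConcurrentHashMap<>();

  // Called when a request is sent; the caller keeps the returned future.
  public CompletableFuture<String> register(String requestId) {
    CompletableFuture<String> future = new CompletableFuture<>();
    pending.put(requestId, future);
    return future;
  }

  // Called by the I/O thread for each arriving response, in any order.
  public void onResponse(String requestId, String response) {
    CompletableFuture<String> future = pending.remove(requestId);
    if (future != null) {
      future.complete(response);
    }
  }

  public static void main(String[] args) {
    ResponseMatcher matcher = new ResponseMatcher();
    CompletableFuture<String> f1 = matcher.register("req-1");
    CompletableFuture<String> f2 = matcher.register("req-2");
    matcher.onResponse("req-2", "second"); // out-of-order arrival
    matcher.onResponse("req-1", "first");
    System.out.println(f1.join() + ", " + f2.join());
  }
}
{code}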

4.
{quote}
With an async interface, we will always need to keep an eye on the queue depth..
{quote}
Good idea, but I'd like to do this work in another JIRA since the current 
patch already seems a little big, :).
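
For reference, one possible shape of that follow-up is to bound the number of 
outstanding requests with a semaphore (the limit and the names below are 
hypothetical):

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

// Sketch only: the semaphore applies back-pressure so the pending-response
// queue cannot grow without bound.
public class BoundedAsyncClient {
  private final Semaphore inFlight = new Semaphore(64); // hypothetical limit

  public CompletableFuture<String> send(String request)
      throws InterruptedException {
    inFlight.acquire(); // blocks callers once 64 requests are outstanding
    return CompletableFuture
        .supplyAsync(() -> "response-to-" + request)
        .whenComplete((response, error) -> inFlight.release());
  }

  public static void main(String[] args) throws InterruptedException {
    BoundedAsyncClient client = new BoundedAsyncClient();
    System.out.println(client.send("writeChunk").join());
  }
}
{code}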

I have changed many places in the latest patch. Any comments or suggestions 
are welcome.

> Ozone: Support asynchronous client API for SCM and containers
> 
>
> Key: HDFS-11580
> URL: https://issues.apache.org/jira/browse/HDFS-11580
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
> Attachments: HDFS-11580-HDFS-7240.001.patch, 
> HDFS-11580-HDFS-7240.002.patch, HDFS-11580-HDFS-7240.003.patch, 
> HDFS-11580-HDFS-7240.004.patch
>
>
> This is an umbrella JIRA to support a set of APIs in asynchronous form.
> For containers, the datanode API currently supports a call 
> {{sendCommand}}; we need to build a proper programming interface and support 
> an async interface.
> There is also a set of SCM APIs that clients can call; it would be nice to 
> support an async interface for those too.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang

2017-04-25 Thread xupeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982768#comment-15982768
 ] 

xupeng commented on HDFS-11192:
---

Hi all,

Is there any update on this issue?

I have also run into the same condition that HDFS-8865 resolved, and I would 
like to know whether I can merge HDFS-8865 into my repo.

> OOM during Quota Initialization lead to Namenode hang
> -
>
> Key: HDFS-11192
> URL: https://issues.apache.org/jira/browse/HDFS-11192
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or 
> cannot be created, it does not notify the parent. The parent keeps waiting 
> for the notify call, and that is not even a timed wait.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
> "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
> native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11373) Backport HDFS-11258 and HDFS-11272 to branch-2.7

2017-04-25 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982651#comment-15982651
 ] 

Brahma Reddy Battula commented on HDFS-11373:
-

[~ajisakaa] thanks for reporting this. Patch LGTM; I will re-trigger Jenkins.

> Backport HDFS-11258 and HDFS-11272 to branch-2.7
> 
>
> Key: HDFS-11373
> URL: https://issues.apache.org/jira/browse/HDFS-11373
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HDFS-11373-branch-2.7.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-7535) Utilize Snapshot diff report for distcp

2017-04-25 Thread Benjamin Huo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982590#comment-15982590
 ] 

Benjamin Huo edited comment on HDFS-7535 at 4/25/17 9:01 AM:
-

I have one question regarding the following comment:
"This snapshot diff report represents the delta that should be applied to the 
backup cluster. For changes like deletion and rename we can directly apply the 
same operations (following some specific order based on their dependency) in 
the backup cluster. For changes like creation, append, and other metadata 
modification we keep using the functionality of the current distcp."

I'm not quite clear on what "we keep using the functionality of the current 
distcp" means.

After the HDFS-7535 fix, is the list of file changes for creation and 
modification generated based on snapshots s1 and s2 on the source cluster, or 
is it generated based on the file changes between the source and destination 
clusters (with the extra cost of transferring the file list between them)?

Thanks
Ben




was (Author: benjaminh):
I have one question regarding the following comment:
"This snapshot diff report represents the delta that should be applied to the 
backup cluster. For changes like deletion and rename we can directly apply the 
same operations (following some specific order based on their dependency) in 
the backup cluster. For changes like creation, append, and other metadata 
modification we keep using the functionality of the current distcp."

I'm not quite clear on what "we keep using the functionality of the current 
distcp" means.

After the HDFS-7535 fix, is the list of file changes for creation and 
modification generated based on snapshots s1 and s2 on the source cluster, or 
is it generated based on the file changes between the source and destination 
clusters?

Thanks
Ben



> Utilize Snapshot diff report for distcp
> ---
>
> Key: HDFS-7535
> URL: https://issues.apache.org/jira/browse/HDFS-7535
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.7.0
>
> Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, 
> HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch
>
>
> Currently HDFS snapshot diff report can identify file/directory creation, 
> deletion, rename and modification under a snapshottable directory. We can use 
> the diff report for distcp between the primary cluster and a backup cluster 
> to avoid unnecessary data copy. This is especially useful when there is a big 
> directory rename happening in the primary cluster: the current distcp cannot 
> detect the rename op thus this rename usually leads to large amounts of real 
> data copy.
> More details of the approach will come in the first comment.
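
For what it's worth, the snapshot diff is computed by a single call against 
the cluster that holds the snapshots; a hedged sketch of reading it through 
the client API (the path and snapshot names are illustrative):

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;

public class SnapshotDiffSketch {
  // Computes the s1 -> s2 delta with one NameNode call on the source
  // cluster; no comparison against the destination cluster is involved.
  static void printDiff(DistributedFileSystem dfs) throws Exception {
    SnapshotDiffReport report =
        dfs.getSnapshotDiffReport(new Path("/src"), "s1", "s2");
    for (SnapshotDiffReport.DiffReportEntry entry : report.getDiffList()) {
      System.out.println(entry);
    }
  }
}
{code}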



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6708) StorageType should be encoded in the block token

2017-04-25 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982524#comment-15982524
 ] 

Chris Douglas commented on HDFS-6708:
-

[~ehiggs], you're right; that's not going to work with HDFS-9807. Changed it 
back to {{StorageType[]}}.

> StorageType should be encoded in the block token
> 
>
> Key: HDFS-6708
> URL: https://issues.apache.org/jira/browse/HDFS-6708
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 2.4.1
>Reporter: Arpit Agarwal
>Assignee: Ewan Higgs
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-6708.0001.patch, HDFS-6708.0002.patch, 
> HDFS-6708.0003.patch, HDFS-6708.0004.patch, HDFS-6708.0005.patch, 
> HDFS-6708.0006.patch, HDFS-6708.0007.patch, HDFS-6708.0008.patch, 
> HDFS-6708.0009.patch, HDFS-6708.0010.patch
>
>
> HDFS-6702 is adding support for file creation based on StorageType.
> The block token is used as a tamper-proof channel for communicating block 
> parameters from the NN to the DN during block creation. The StorageType 
> should be included in this block token.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-6708) StorageType should be encoded in the block token

2017-04-25 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-6708:

Attachment: HDFS-6708.0010.patch

> StorageType should be encoded in the block token
> 
>
> Key: HDFS-6708
> URL: https://issues.apache.org/jira/browse/HDFS-6708
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 2.4.1
>Reporter: Arpit Agarwal
>Assignee: Ewan Higgs
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-6708.0001.patch, HDFS-6708.0002.patch, 
> HDFS-6708.0003.patch, HDFS-6708.0004.patch, HDFS-6708.0005.patch, 
> HDFS-6708.0006.patch, HDFS-6708.0007.patch, HDFS-6708.0008.patch, 
> HDFS-6708.0009.patch, HDFS-6708.0010.patch
>
>
> HDFS-6702 is adding support for file creation based on StorageType.
> The block token is used as a tamper-proof channel for communicating block 
> parameters from the NN to the DN during block creation. The StorageType 
> should be included in this block token.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11609) Some blocks can be permanently lost if nodes are decommissioned while dead

2017-04-25 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-11609:
--
Target Version/s: 2.7.4, 2.8.1  (was: 2.8.1)

> Some blocks can be permanently lost if nodes are decommissioned while dead
> --
>
> Key: HDFS-11609
> URL: https://issues.apache.org/jira/browse/HDFS-11609
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Blocker
> Attachments: HDFS-11609.branch-2.patch, HDFS-11609.trunk.patch, 
> HDFS-11609_v2.branch-2.patch, HDFS-11609_v2.trunk.patch
>
>
> When all the nodes containing a replica of a block are decommissioned while 
> they are dead, they get decommissioned right away even if there are missing 
> blocks. This behavior was introduced by HDFS-7374.
> The problem starts when those decommissioned nodes are brought back online. 
> The namenode no longer shows missing blocks, which creates a false sense of 
> cluster health. When the decommissioned nodes are removed and reformatted, 
> the block data is permanently lost. The namenode will report missing blocks 
> after the heartbeat recheck interval (e.g. 10 minutes) from the moment the 
> last node is taken down.
> There are multiple issues in the code. As some of them cause different 
> behaviors in testing vs. production, it took a while to reproduce this in a 
> unit test. I will present an analysis and a proposal soon.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11695) [SPS]: Namenode failed to start while loading SPS xAttrs from the edits log.

2017-04-25 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982436#comment-15982436
 ] 

Uma Maheswara Rao G commented on HDFS-11695:


Thanks [~surendrasingh] for reporting it. Can you explain the scenario in 
which this occurs?
I think you are right. We can reproduce this in the following case: call 
satisfyStoragePolicy on one directory first, then try calling satisfy policy 
on the parent directory; it will refuse to satisfy the policy because a 
subdirectory already has the xattr. But could you explain how you hit this 
while starting the NN?

Do you want to fix it? Let me know if you need any help.
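
For clarity, a rough sketch of the reproduction sequence described above 
(paths are illustrative, and I'm assuming the {{satisfyStoragePolicy}} client 
call on {{DistributedFileSystem}} from the HDFS-10285 branch):

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SpsReproSketch {
  static void reproduce(DistributedFileSystem dfs) throws Exception {
    Path child = new Path("/ssl/sub");
    Path parent = new Path("/ssl");
    dfs.satisfyStoragePolicy(child);  // sets the SPS xattr on the child
    dfs.satisfyStoragePolicy(parent); // rejected: a descendant already has the xattr
  }
}
{code}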

> [SPS]: Namenode failed to start while loading SPS xAttrs from the edits log.
> 
>
> Key: HDFS-11695
> URL: https://issues.apache.org/jira/browse/HDFS-11695
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: fsimage.xml
>
>
> {noformat}
> 2017-04-23 13:27:51,971 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> java.io.IOException: Cannot request to call satisfy storage policy on path 
> /ssl, as this file/dir was already called for satisfying storage policy.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSatisfyStoragePolicy(FSDirAttrOp.java:511)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirXAttrOp.unprotectedSetXAttrs(FSDirXAttrOp.java:284)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:918)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:241)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:150)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike

2017-04-25 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982410#comment-15982410
 ] 

Zhe Zhang commented on HDFS-11384:
--

Thanks for the update [~shv]. Now all other tests in {{TestBalancer}} pass 
except for {{testBalancerRPCDelay}}:
{code}
java.util.concurrent.TimeoutException: Timed out waiting for /tmp.txt to reach 
40 replicas

at 
org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:764)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.createFile(TestBalancer.java:306)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:847)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerRPCDelay(TestBalancer.java:2071)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}

> Add option for balancer to disperse getBlocks calls to avoid NameNode's 
> rpc.CallQueueLength spike
> -
>
> Key: HDFS-11384
> URL: https://issues.apache.org/jira/browse/HDFS-11384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.3
>Reporter: yunjiong zhao
>Assignee: Konstantin Shvachko
> Attachments: balancer.day.png, balancer.week.png, 
> HDFS-11384.001.patch, HDFS-11384.002.patch, HDFS-11384.003.patch, 
> HDFS-11384.004.patch, HDFS-11384.005.patch, HDFS-11384.006.patch, 
> HDFS-11384-007.patch, HDFS-11384.008.patch
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes can 
> cause the NameNode's rpc.CallQueueLength to spike. We observed that this 
> situation could cause an HBase cluster failure due to RegionServer WAL 
> timeouts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org