[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869330#comment-15869330
 ] 

Hadoop QA commented on HDFS-11402:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11402 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852975/HDFS-11402.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 043c43761295 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a136936 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.



> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  

[jira] [Commented] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details

2017-02-15 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869270#comment-15869270
 ] 

Surendra Singh Lilhore commented on HDFS-10506:
---

Reviewed and tested the patch. Looks good to me.

> OIV's ReverseXML processor cannot reconstruct some snapshot details
> ---
>
> Key: HDFS-10506
> URL: https://issues.apache.org/jira/browse/HDFS-10506
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.8.0
>Reporter: Colin P. McCabe
>Assignee: Akira Ajisaka
> Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, 
> HDFS-10506.03.patch
>
>
> OIV's ReverseXML processor cannot reconstruct some snapshot details. 
> Specifically, <dirDiffEntry> should contain a <name> and <attributes> field, 
> but does not. <fileDiffEntry> should contain a <name> field. OIV also needs 
> to be changed to emit these fields into the XML (they are currently 
> missing).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2017-02-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HDFS-11419:
---

Assignee: Chen Liang

> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> --
>
> Key: HDFS-11419
> URL: https://issues.apache.org/jira/browse/HDFS-11419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);{code}
> and then check whether that returned datanode satisfies the storage type 
> requirement:
> {code}storage = chooseStorage4Block(
>     chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}; otherwise the node is added to the excluded 
> nodes, and the loop runs again until {{numOfReplicas}} is down to 0.
> A problem here is that the storage type is not considered until after a 
> random node has already been returned. We've seen a case where a cluster has 
> a large number of datanodes while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type, adds them to the excluded set, and 
> tries again and again.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanode and, 
> if it is in the excluded set, tries to find another random node. So the more 
> excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we have basically wasted one iteration.
> Therefore, this JIRA proposes to augment/modify the relevant classes so that 
> datanodes can be found more efficiently. There are currently two different 
> high-level solutions we are considering:
> 1. add some field to the Node base types to describe the storage type info, 
> and when searching for a node, take such field(s) into account and do not 
> return a node that does not meet the storage type requirement.
> 2. change the {{NetworkTopology}} class to be aware of storage types: for 
> each storage type there is one tree subset that connects all the nodes with 
> that type, and a search happens on only one such subset, so unexpected 
> storage types are simply not in the search space.
> Thanks [~szetszwo] for the offline discussion, and any comments are more 
> than welcome.
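
For illustration, here is a small, self-contained simulation of the retry 
pattern described in the report above (all names and numbers are made up; this 
is not the actual {{BlockPlacementPolicyDefault}} code). With 10,000 nodes and 
only 50 matching the storage type, most random picks are wasted:

{code}
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Toy simulation: storage type is checked only AFTER a random node is
// returned, and rejected nodes are added to the excluded set.
public class ChooseRandomSimulation {
  public static void main(String[] args) {
    final int totalNodes = 10_000;   // large cluster
    final int matchingNodes = 50;    // few nodes satisfy the storage type
    final int replicasNeeded = 3;
    Random rand = new Random(42);
    Set<Integer> excluded = new HashSet<>();
    int picks = 0;
    int placed = 0;
    while (placed < replicasNeeded) {
      int node = rand.nextInt(totalNodes);
      picks++;
      if (excluded.contains(node)) {
        continue;                    // the wasted iteration described above
      }
      if (node < matchingNodes) {    // nodes [0, matchingNodes) "match"
        placed++;
      }
      excluded.add(node);            // rejected or chosen, it is now excluded
    }
    System.out.println("random picks needed for " + replicasNeeded
        + " replicas: " + picks);    // typically several hundred picks
  }
}
{code}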



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2017-02-15 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869262#comment-15869262
 ] 

Yiqun Lin commented on HDFS-11419:
--

I looked into this again. I found something that seems not absolutely accurate.
{quote}
So, for the most part, this code blindly picks random datanodes that do not 
satisfy the storage type, and adds the node to excluded and tries again and 
again.
{quote}
From the code of {{BlockPlacementPolicyDefault#chooseRandom}}, it doesn't add 
the node to the excluded set when the random node does not satisfy the storage 
type. It only does so when a storage is found. The related code:
{code}
...
storage = chooseStorage4Block(
    chosenNode, blocksize, results, entry.getKey());
if (storage != null) { // <=== added to excluded only when a storage was found
  numOfReplicas--;
  if (firstChosen == null) {
    firstChosen = storage;
  }
  // add node (subclasses may also add related nodes) to excludedNode
  addToExcludedNodes(chosenNode, excludedNodes);
...
{code}

> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> --
>
> Key: HDFS-11419
> URL: https://issues.apache.org/jira/browse/HDFS-11419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);{code}
> and then check whether that returned datanode satisfies the storage type 
> requirement:
> {code}storage = chooseStorage4Block(
>     chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}; otherwise the node is added to the excluded 
> nodes, and the loop runs again until {{numOfReplicas}} is down to 0.
> A problem here is that the storage type is not considered until after a 
> random node has already been returned. We've seen a case where a cluster has 
> a large number of datanodes while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type, adds them to the excluded set, and 
> tries again and again.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanode and, 
> if it is in the excluded set, tries to find another random node. So the more 
> excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we have basically wasted one iteration.
> Therefore, this JIRA proposes to augment/modify the relevant classes so that 
> datanodes can be found more efficiently. There are currently two different 
> high-level solutions we are considering:
> 1. add some field to the Node base types to describe the storage type info, 
> and when searching for a node, take such field(s) into account and do not 
> return a node that does not meet the storage type requirement.
> 2. change the {{NetworkTopology}} class to be aware of storage types: for 
> each storage type there is one tree subset that connects all the nodes with 
> that type, and a search happens on only one such subset, so unexpected 
> storage types are simply not in the search space.
> Thanks [~szetszwo] for the offline discussion, and any comments are more 
> than welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11417) Add datanode admin command to get the storage info.

2017-02-15 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869251#comment-15869251
 ] 

Surendra Singh Lilhore edited comment on HDFS-11417 at 2/16/17 5:33 AM:


I feel this is required when you have only a shell interface to access the 
cluster data.


was (Author: surendrasingh):
I think this is required when you have only a shell interface to access the 
cluster data.

> Add datanode admin command to get the storage info.
> ---
>
> Key: HDFS-11417
> URL: https://issues.apache.org/jira/browse/HDFS-11417
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.7.3
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>
> It would be good to add an admin command for the datanode to get the data 
> directory info like storage type, directory path, number of blocks, 
> capacity, and used space. This will be helpful in large clusters where the 
> DN has multiple data directories configured.
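
As a rough illustration only, the output of such a command might look like the 
sketch below; the option name and the columns are hypothetical, since no such 
dfsadmin option exists yet:

{noformat}
$ hdfs dfsadmin -getStorageInfo <datanode-host:port>    # hypothetical option
StorageType   Directory            Blocks    Capacity   Used
DISK          /data/1/dfs/dn       182934    7.2 TB     5.1 TB
DISK          /data/2/dfs/dn       180211    7.2 TB     5.0 TB
ARCHIVE       /archive/1/dfs/dn     40023    14.5 TB    9.8 TB
{noformat}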



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.

2017-02-15 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869251#comment-15869251
 ] 

Surendra Singh Lilhore commented on HDFS-11417:
---

I think this is required when you have only a shell interface to access the 
cluster data.

> Add datanode admin command to get the storage info.
> ---
>
> Key: HDFS-11417
> URL: https://issues.apache.org/jira/browse/HDFS-11417
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.7.3
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>
> It would be good to add an admin command for the datanode to get the data 
> directory info like storage type, directory path, number of blocks, 
> capacity, and used space. This will be helpful in large clusters where the 
> DN has multiple data directories configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9388) Refactor decommission related code to support maintenance state for datanodes

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869184#comment-15869184
 ] 

Ming Ma commented on HDFS-9388:
---

[~manojg], most of the work has been done by other jiras. Some specific items 
left include renaming DecommissionManager and deciding whether comments about 
decommission should be updated. Please feel free to assign it to yourself. 
Thank you!

> Refactor decommission related code to support maintenance state for datanodes
> -
>
> Key: HDFS-9388
> URL: https://issues.apache.org/jira/browse/HDFS-9388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>
> Lots of code can be shared between the existing decommission functionality 
> and to-be-added maintenance state support for datanodes. To make it easier to 
> add maintenance state support, let us first modify the existing code to make 
> it more general.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869179#comment-15869179
 ] 

Hadoop QA commented on HDFS-10506:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 55s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m 50s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852953/HDFS-10506.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux d037e9ef6a16 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0741dd3 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18385/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18385/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.



> OIV's ReverseXML processor cannot reconstruct some snapshot details
> ---
>
> Key: HDFS-10506
> URL: https://issues.apache.org/jira/browse/HDFS-10506
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.8.0
>Reporter: Colin P. McCabe
>Assignee: Akira Ajisaka
> Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, 
> HDFS-10506.03.patch
>
>
> OIV's ReverseXML processor cannot reconstruct some snapshot details.  
> Specifically, <dirDiffEntry> should contain a <name> and <attributes> 

[jira] [Commented] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI

2017-02-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869173#comment-15869173
 ] 

Hudson commented on HDFS-11265:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11260 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11260/])
HDFS-11265. Extend visualization for Maintenance Mode under Datanode tab 
(mingma: rev a136936d018b5cebb7aad9a01ea0dcc366e1c3b8)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css


> Extend visualization for Maintenance Mode under Datanode tab in the NameNode 
> UI
> ---
>
> Key: HDFS-11265
> URL: https://issues.apache.org/jira/browse/HDFS-11265
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Elek, Marton
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: ex.png, HDFS-11265.001.patch, icons.png, x.png
>
>
> With HDFS-9391, DataNodes in Maintenance Mode states are shown on the 
> DataNode page in the NameNode UI, but they lack the icon visualization shown 
> for other node states. Need to extend the icon visualization to cover 
> Maintenance Mode.

[jira] [Commented] (HDFS-11411) Avoid OutOfMemoryError in TestMaintenanceState test runs

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869170#comment-15869170
 ] 

Ming Ma commented on HDFS-11411:


Looks good. Nits:

* For the {{testExpectedReplication}} case, should we move the setup() call 
into the function that calls startCluster?
* Maybe at the end of the function, call teardown first, then setup for the 
next iteration. Otherwise, setup will be called twice for the first iteration 
of the test case (once from the test case setup and once from the added 
explicit call). A sketch of this ordering follows below.
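
A minimal sketch of the suggested ordering, with stub methods standing in for 
the {{AdminStatesBaseTest}} helpers (all names and signatures here are 
assumptions, not the actual test code):

{code}
import org.junit.Test;

public class MaintenanceStateSketch {
  private static final int[] REPLICATION_FACTORS = {1, 2, 3};

  @Test
  public void testExpectedReplication() throws Exception {
    for (int i = 0; i < REPLICATION_FACTORS.length; i++) {
      if (i > 0) {
        teardown();  // release the previous mini-cluster first...
        setup();     // ...then prepare a fresh one for this iteration, so the
                     // first iteration reuses the setup() the test runner
                     // already performed instead of calling it twice.
      }
      startCluster(REPLICATION_FACTORS[i]);
      // ... assertions on the expected replication would go here ...
    }
  }

  // Stubs standing in for the AdminStatesBaseTest helpers.
  private void setup() {}
  private void teardown() {}
  private void startCluster(int replication) {}
}
{code}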

> Avoid OutOfMemoryError in TestMaintenanceState test runs
> 
>
> Key: HDFS-11411
> URL: https://issues.apache.org/jira/browse/HDFS-11411
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11411.01.patch
>
>
> TestMaintenanceState test runs are seeing OutOfMemoryError issues quite 
> frequently now. Need to fix tests that are consuming lots of memory/threads. 
> {noformat}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.TestMaintenanceState
> Tests run: 21, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 219.479 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.Te
> testTransitionFromDecommissioned(org.apache.hadoop.hdfs.TestMaintenanceState) 
>  Time elapsed: 0.64 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testTakeDeadNodeOutOfMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) 
>  Time elapsed: 0.031 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testWithNNAndDNRestart(org.apache.hadoop.hdfs.TestMaintenanceState)  Time 
> elapsed: 0.03 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testMultipleNodesMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState)  
> Time elapsed: 60.127 sec  <<< ERROR!
> java.io.IOException: Problem starting http server
> Results :
> Tests in error: 
>   
> TestMaintenanceState.testTransitionFromDecommissioned:225->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s
>   
> TestMaintenanceState.testTakeDeadNodeOutOfMaintenance:636->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s
>   
> TestMaintenanceState.testWithNNAndDNRestart:692->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.startCluste
>   
> TestMaintenanceState.testMultipleNodesMaintenance:532->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.start
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI

2017-02-15 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-11265:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha3
   2.9.0
   Status: Resolved  (was: Patch Available)

Thanks [~elektrobank] for the contribution and [~manojg] for the review. 
Committed to trunk and branch-2.

> Extend visualization for Maintenance Mode under Datanode tab in the NameNode 
> UI
> ---
>
> Key: HDFS-11265
> URL: https://issues.apache.org/jira/browse/HDFS-11265
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Elek, Marton
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: ex.png, HDFS-11265.001.patch, icons.png, x.png
>
>
> With HDFS-9391, DataNodes in Maintenance Mode states are shown on the 
> DataNode page in the NameNode UI, but they lack the icon visualization shown 
> for other node states. Need to extend the icon visualization to cover 
> Maintenance Mode.

[jira] [Commented] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2017-02-15 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869150#comment-15869150
 ] 

Yiqun Lin commented on HDFS-11419:
--

Great proposal, [~vagarychen]! Actually, the problem is that the block 
placement policy doesn't use the expected storage types when choosing random 
nodes, and that lets some invalid nodes be chosen. Reading the two solutions, 
I prefer the first one. Since storage types were introduced in the DataNode, 
it would be better to add this info as a field of Node, indicating which 
storage types the Node contains. If we do this, we just need a small change 
in {{NetworkTopology#chooseRandom}}.
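
A rough sketch of that idea (all names here are hypothetical, not the existing 
{{Node}} or {{NetworkTopology}} API): each node advertises its storage types, 
and the random choice filters on them up front instead of rejecting nodes 
after the fact.

{code}
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of solution 1; illustrative only.
class SketchNode {
  enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK }

  final String name;
  final EnumSet<StorageType> storageTypes;  // the proposed new field

  SketchNode(String name, EnumSet<StorageType> types) {
    this.name = name;
    this.storageTypes = types;
  }

  // Pick a random node that advertises the required storage type, so
  // mismatching nodes never enter the random draw at all.
  static SketchNode chooseRandom(List<SketchNode> scope,
                                 StorageType required, Random rand) {
    List<SketchNode> candidates = new ArrayList<>();
    for (SketchNode n : scope) {
      if (n.storageTypes.contains(required)) {
        candidates.add(n);
      }
    }
    return candidates.isEmpty()
        ? null
        : candidates.get(rand.nextInt(candidates.size()));
  }
}
{code}

A real implementation would likely keep per-storage-type indexes rather than 
filtering the scope on every call, but the key point stands: nodes that cannot 
match are never drawn.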


> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> --
>
> Key: HDFS-11419
> URL: https://issues.apache.org/jira/browse/HDFS-11419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);{code}
> and then check whether that returned datanode satisfies the storage type 
> requirement:
> {code}storage = chooseStorage4Block(
>     chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}; otherwise the node is added to the excluded 
> nodes, and the loop runs again until {{numOfReplicas}} is down to 0.
> A problem here is that the storage type is not considered until after a 
> random node has already been returned. We've seen a case where a cluster has 
> a large number of datanodes while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type, adds them to the excluded set, and 
> tries again and again.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanode and, 
> if it is in the excluded set, tries to find another random node. So the more 
> excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we have basically wasted one iteration.
> Therefore, this JIRA proposes to augment/modify the relevant classes so that 
> datanodes can be found more efficiently. There are currently two different 
> high-level solutions we are considering:
> 1. add some field to the Node base types to describe the storage type info, 
> and when searching for a node, take such field(s) into account and do not 
> return a node that does not meet the storage type requirement.
> 2. change the {{NetworkTopology}} class to be aware of storage types: for 
> each storage type there is one tree subset that connects all the nodes with 
> that type, and a search happens on only one such subset, so unexpected 
> storage types are simply not in the search space.
> Thanks [~szetszwo] for the offline discussion, and any comments are more 
> than welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Attachment: HDFS-11402.01.patch

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11402.01.patch
>
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.
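
The freezing semantics proposed above can be illustrated with a tiny, 
self-contained model (all names are illustrative; this is not the HDFS patch): 
each snapshot records the point-in-time length of every open file under its 
root, so later writes no longer leak into earlier snapshots.

{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the proposed behavior; illustrative only.
public class SnapshotFreezeSketch {
  static class OpenFile {
    long length;                              // grows as writes arrive
    final Map<String, Long> frozenLengths = new LinkedHashMap<>();
  }

  final List<OpenFile> openFilesUnderRoot = new ArrayList<>();

  // Analogous to DirectorySnapshottableFeature#addSnapshot with freezing on:
  // record a length-only diff for every open file under the snapshot root.
  void takeSnapshot(String snapshotName) {
    for (OpenFile f : openFilesUnderRoot) {
      f.frozenLengths.put(snapshotName, f.length);
    }
  }

  public static void main(String[] args) {
    SnapshotFreezeSketch root = new SnapshotFreezeSketch();
    OpenFile file1 = new OpenFile();
    root.openFilesUnderRoot.add(file1);
    file1.length = 0;          root.takeSnapshot("snap1");
    file1.length = 524_288;    root.takeSnapshot("snap2");
    file1.length = 1_048_576;  // file closed at 1MB; snapshots unaffected
    System.out.println(file1.frozenLengths);  // {snap1=0, snap2=524288}
  }
}
{code}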



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Status: Patch Available  (was: Open)

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11402.01.patch
>
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Status: Open  (was: Patch Available)

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Attachment: (was: HDFS-11402.01.patch)

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Attachment: HDFS-11402.01.patch

Attached v01 patch, which addresses the following:

1. {{LeaseManager#getINodeWithLeases()}} returns the set of open files, in 
{{INodesInPath}} format, under any given {{INodeDirectory}}.
2. {{INodesInPath#isAncestor()}} can tell whether the given {{INodeDirectory}} 
is an ancestor of it. Used by {{LeaseManager}} to gather open files under a 
snap root.
3. New config param {{dfs.namenode.snapshot.freeze.openfiles}} in 
{{DFSConfigKeys}} to turn this feature on/off.
4. When {{dfs.namenode.snapshot.freeze.openfiles}} is enabled, 
{{DirectorySnapshottableFeature#addSnapshot}} gathers open files under the 
snap root from {{LeaseManager}} and records a modification for each of them to 
freeze the file length in the snapshot.

Tests:
1. {{TestLeaseManager}} updated to test the 
{{LeaseManager#getINodeWithLeases()}} functionality.
2. TestOpenFilesWithSnapshot and TestSnapshotManager updated to test Snapshots 
when there are files being written: verify file lengths in snapshots stay 
frozen even after more writes go to the live file, verify open files in a 
non-snap dir are not affected, verify file lengths remain frozen across NN 
restart, etc.

[~yzhangal], [~jingzhao], [~andrew.wang], can you please review the patch and 
share your suggestions/comments?
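
For readers following along, here is a tiny standalone sketch of the ancestor 
filtering in items 1 and 2 above; plain paths stand in for {{INodesInPath}}, 
and none of this is the actual patch code:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeaseFilterSketch {
  // Analogous in spirit to INodesInPath#isAncestor, using plain paths.
  static boolean isAncestor(String dir, String path) {
    return path.startsWith(dir.endsWith("/") ? dir : dir + "/");
  }

  // Analogous in spirit to LeaseManager#getINodeWithLeases(INodeDirectory):
  // from all leased (open) files, keep those under the given snapshot root.
  static List<String> openFilesUnder(List<String> leasedFiles,
                                     String snapRoot) {
    List<String> result = new ArrayList<>();
    for (String file : leasedFiles) {
      if (isAncestor(snapRoot, file)) {
        result.add(file);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> leased = Arrays.asList(
        "/user/a/log.txt", "/user/b/part-0000", "/tmp/scratch");
    System.out.println(openFilesUnder(leased, "/user"));
    // -> [/user/a/log.txt, /user/b/part-0000]
  }
}
{code}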

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11402.01.patch
>
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.

[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11402:
--
Status: Patch Available  (was: Open)

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11402.01.patch
>
>
> *Problem:*
> 1. When there are files being written and HDFS Snapshots are taken in 
> parallel, the Snapshots do capture all these files, but the files being 
> written do not have their point-in-time length captured in the Snapshots. 
> That is, these open files are not frozen in HDFS Snapshots. These open files 
> grow/shrink in length, just like the original file, even after the snapshot 
> time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot, 
> and after the file close, the file lengths in all those snapshots end up the 
> same.
> Assume File1 is opened for write and a total of 1MB is written to it. While 
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2---T3---T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of {{INodesInPath}} entries for all open 
> files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and 
> doesn't introduce any new disk formats. So, I don't think we will need any 
> new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any 
> number of snapshots in O(1) time. Though the LeaseManager keeps all the open 
> files with leases in an in-memory map, an iteration is still needed to prune 
> the needed open files and then run recordModification on each of them. So it 
> will not be strictly O(1) with the above proposal, but the increase is only 
> marginal, as the new order will be O(open_files_under_snap_dir). In order to 
> avoid any change in HDFS Snapshot behavior for open files and any change in 
> time complexity, this improvement can be made under a new config, 
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be 
> {{false}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869107#comment-15869107
 ] 

Ming Ma commented on HDFS-11265:


Strictly speaking, live decommissioned nodes can serve read requests as the 
least preferred replicas. But even with that, the existing patch LGTM. +1.

> Extend visualization for Maintenance Mode under Datanode tab in the NameNode 
> UI
> ---
>
> Key: HDFS-11265
> URL: https://issues.apache.org/jira/browse/HDFS-11265
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Elek, Marton
> Attachments: ex.png, HDFS-11265.001.patch, icons.png, x.png
>
>
> With HDFS-9391, DataNodes in Maintenance Mode states are shown on the 
> DataNode page in the NameNode UI, but they lack the icon visualization shown 
> for other node states. Need to extend the icon visualization to cover 
> Maintenance Mode.

[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files

2017-02-15 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869091#comment-15869091
 ] 

Manoj Govindassamy commented on HDFS-11402:
---

[~yzhangal],

LeaseManager maintains INodeFileIDs of all leased open files in-memory. 
Moreover, with the proposed design LeaseManager now exposes a functionality to 
get all files with leases under the given directory. So at the time of 
Snapshot, the SnapshotManager will only run extra operations for the open files 
under the snapshot dir and not on all the open files in the system. For all the 
open files, creation of diff record is very light weight as it only copies the 
length field from the file and not blocks. 

Your proposal of maintaining a diff record for each open file at the time of 
file length update, well before the Snapshot, is _not_ going to make the 
overall time complexity O(1). The diff record computation is not what 
contributes the extra time at Snapshot time. It is the enumeration of open 
files under the snapshot directory, and the diff record saving for each of 
those open files, that makes the overall complexity for snapshot creation 
O(open files count under snap root). And, the 
{{dfs.namenode.snapshot.freeze.openfiles}} config is {{false}} by default. 
Users who are interested in this feature can enable it at a marginal cost; 
otherwise there aren't any performance dips in the normal snapshot operation. 
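
To make the described flow concrete, here is a rough sketch of the 
snapshot-time path (illustrative only; {{getINodesWithLeaseUnder}} is a 
placeholder name for the proposed LeaseManager lookup, not a committed API):

{code}
// Illustrative sketch, not actual trunk code. The placeholder
// getINodesWithLeaseUnder() stands for the proposed LeaseManager lookup,
// which is O(open files under the snapshot root), not O(all open files).
void freezeOpenFiles(INodeDirectory snapRoot, int latestSnapshotId) {
  for (INodeFile openFile : leaseManager.getINodesWithLeaseUnder(snapRoot)) {
    // Records a FileDiff that captures only the current file length;
    // block lists are not copied, keeping each record lightweight.
    openFile.recordModification(latestSnapshotId);
  }
}
{code}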

> HDFS Snapshots should capture point-in-time copies of OPEN files
> 
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.6.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> *Problem:*
> 1. When files are being written and HDFS Snapshots are taken in parallel, the 
> Snapshots do capture all these files, but the files being written do not have 
> their point-in-time file length captured. That is, these open files are not 
> frozen in HDFS Snapshots. These open files grow/shrink in length, just like 
> the original file, even after the snapshot time.
> 2. At the time of file close or any other metadata modification operation on 
> these files, HDFS reconciles the file length and records the modification in 
> the last taken Snapshot. All the previously taken Snapshots continue to have 
> those open files with no modification recorded. So, all those previous 
> snapshots end up using the final modification record in the last snapshot. 
> Thus, after the file close, file lengths in all those snapshots will end up 
> the same.
> Assume File1 is opened for write and a total of 1MB written to it. While the 
> writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1---T2-T3T4-->
> |---Snap1--Snap2-Snap3--->
> |---File1.open---write-write---close->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> 
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking Snapshot, {{SnapshotManager#createSnapshot}} can 
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze 
> open files. 
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the 
> {{LeaseManager}} and get a list of INodesInPath for all open files under the 
> snapshot dir. 
> 3. After the Snapshot creation, Diff creation, and modification time update, 
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke 
> {{INodeFile#recordModification}} for each of the open files. This way, the 
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for 
> each of the open files. 
> 4. The above model follows the current Snapshot and Diff protocols and doesn't 
> introduce any new disk formats. So, I don't think we will need any new 
> FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshot was the ability to take any number 
> of snapshots in O(1) time. Though the LeaseManager keeps all the open files 
> with leases in an in-memory map, an iteration is still needed to prune the 
> needed open files and then run recordModification on each of them. So, it 
> will not be a strict O(1) with the above proposal. But, it is going to be 
> only a marginal increase, as the new order will be 
> O(open_files_under_snap_dir). To avoid any change in HDFS Snapshot behavior 
> for open files and any change in time complexity, this improvement can be 
> made under a new config {{"dfs.namenode.snapshot.freeze.openfiles"}} which 
> by default can be 
> 

[jira] [Commented] (HDFS-11411) Avoid OutOfMemoryError in TestMaintenanceState test runs

2017-02-15 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869075#comment-15869075
 ] 

Manoj Govindassamy commented on HDFS-11411:
---

Test failures are not related to the patch.

> Avoid OutOfMemoryError in TestMaintenanceState test runs
> 
>
> Key: HDFS-11411
> URL: https://issues.apache.org/jira/browse/HDFS-11411
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11411.01.patch
>
>
> TestMaintenanceState test runs are seeing OutOfMemoryError issues quite 
> frequently now. Need to fix tests that are consuming lots of memory/threads. 
> {noformat}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.TestMaintenanceState
> Tests run: 21, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 219.479 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.Te
> testTransitionFromDecommissioned(org.apache.hadoop.hdfs.TestMaintenanceState) 
>  Time elapsed: 0.64 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testTakeDeadNodeOutOfMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) 
>  Time elapsed: 0.031 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testWithNNAndDNRestart(org.apache.hadoop.hdfs.TestMaintenanceState)  Time 
> elapsed: 0.03 sec  <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testMultipleNodesMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState)  
> Time elapsed: 60.127 sec  <<< ERROR!
> java.io.IOException: Problem starting http server
> Results :
> Tests in error: 
>   
> TestMaintenanceState.testTransitionFromDecommissioned:225->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s
>   
> TestMaintenanceState.testTakeDeadNodeOutOfMaintenance:636->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s
>   
> TestMaintenanceState.testWithNNAndDNRestart:692->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.startCluste
>   
> TestMaintenanceState.testMultipleNodesMaintenance:532->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.start
> {noformat}






[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over

2017-02-15 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869026#comment-15869026
 ] 

Yongjun Zhang commented on HDFS-11352:
--

Thanks [~ajisakaa].


> Potential deadlock in NN when failing over
> --
>
> Key: HDFS-11352
> URL: https://issues.apache.org/jira/browse/HDFS-11352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 2.6.6
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.7.4, 2.6.6
>
> Attachments: HDFS-11352-branch-2.7.000.patch
>
>
> HDFS-11180 fixed a general class of deadlock that can occur when failing over 
>  between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for 
> more details). In trunk and branch-2/branch-2.8 this fix was successful by 
> making the metrics calls not synchronize on FSEditLog.
> In branch-2.6 and branch-2.7 there is one more method, 
> {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the 
> lock on FSEditLog and thus can result in the same deadlock scenario. This can 
> be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the 
> patch in HDFS-11290_ on either of these branches (it fails currently).






[jira] [Updated] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details

2017-02-15 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-10506:
-
Attachment: HDFS-10506.03.patch

Rebased.

> OIV's ReverseXML processor cannot reconstruct some snapshot details
> ---
>
> Key: HDFS-10506
> URL: https://issues.apache.org/jira/browse/HDFS-10506
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.8.0
>Reporter: Colin P. McCabe
>Assignee: Akira Ajisaka
> Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, 
> HDFS-10506.03.patch
>
>
> OIV's ReverseXML processor cannot reconstruct some snapshot details.  
> Specifically,  should contain a  and  field, 
> but does not.   should contain a  field.  OIV also 
> needs to be changed to emit these fields into the XML (they are currently 
> missing).






[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over

2017-02-15 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869008#comment-15869008
 ] 

Akira Ajisaka commented on HDFS-11352:
--

Hi [~yzhangal], yes, it's correct.

> Potential deadlock in NN when failing over
> --
>
> Key: HDFS-11352
> URL: https://issues.apache.org/jira/browse/HDFS-11352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 2.6.6
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.7.4, 2.6.6
>
> Attachments: HDFS-11352-branch-2.7.000.patch
>
>
> HDFS-11180 fixed a general class of deadlock that can occur when failing over 
>  between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for 
> more details). In trunk and branch-2/branch-2.8 this fix was successful by 
> making the metrics calls not synchronize on FSEditLog.
> In branch-2.6 and branch-2.7 there is one more method, 
> {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the 
> lock on FSEditLog and thus can result in the same deadlock scenario. This can 
> be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the 
> patch in HDFS-11290_ on either of these branches (it fails currently).






[jira] [Commented] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868977#comment-15868977
 ] 

Ming Ma commented on HDFS-11412:


Thanks [~manojg].

* Regarding whether to use the default replication factor or the max 
replication factor, do you care about the following use case? default == 3, 
max == 30. Block A has a large replication factor of 30 and would like to keep 
at least 20 live replicas around during maintenance. Then put 20 nodes with 
replicas of Block A into maintenance at the same time. To make sure there are 
at least 20 live replicas during maintenance, the system needs to honor 
minReplicationToBeInMaintenance == 20. 
* Impact on the {{getExpectedLiveRedundancyNum}} calculation. Set 
minReplicationToBeInMaintenance to 3. Block B's replication factor is 2. Put 
one of its replicas into maintenance. Inside {{getExpectedLiveRedundancyNum}}, 
{{Math.max(expectedRedundancy - numberReplicas.maintenanceReplicas(), 
getMinReplicationToBeInMaintenance())}} == 3. Ideally the function should 
return 2 (see the sketch below).
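
To spell out the second bullet with concrete numbers (plain Java; the variable 
names mirror the ones above, and the capped variant is just an idea for 
discussion, not a patch):

{code}
public class MaintenanceRedundancyExample {
  public static void main(String[] args) {
    int expectedRedundancy = 2;              // Block B's replication factor
    int maintenanceReplicas = 1;             // one replica in maintenance
    int minReplicationToBeInMaintenance = 3; // configured minimum

    // Formula currently inside getExpectedLiveRedundancyNum:
    int current = Math.max(expectedRedundancy - maintenanceReplicas,
        minReplicationToBeInMaintenance);
    System.out.println(current);             // prints 3, but 2 is ideal

    // One possible fix for discussion: never demand more live replicas
    // than the block's own replication factor.
    int capped = Math.min(expectedRedundancy,
        Math.max(expectedRedundancy - maintenanceReplicas,
            minReplicationToBeInMaintenance));
    System.out.println(capped);              // prints 2
  }
}
{code}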

> Maintenance minimum replication config value allowable range should be {0 - 
> DefaultReplication}
> ---
>
> Key: HDFS-11412
> URL: https://issues.apache.org/jira/browse/HDFS-11412
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11412.01.patch
>
>
> Currently the allowed value range for Maintenance Min Replication 
> {{dfs.namenode.maintenance.replication.min}} is 0 to 
> {{dfs.namenode.replication.min}} (default=1). Users who don't want to affect 
> the performance of the cluster would wish to have a Maintenance Min 
> Replication number greater than 1, say 2. In the current design, it is 
> possible to have this Maintenance Min Replication configuration, but only 
> after changing the NameNode-level Block Min Replication to 2, which could 
> increase the overall latency of client writes.
> Technically speaking, we should be allowing Maintenance Min Replication to be 
> in the range 0 to dfs.replication.max.  
> * There is always the config value of 0 for users not wanting any 
> availability/performance guarantees during maintenance. 
> * And, performance-centric workloads can still get maintenance done without 
> major disruptions by having a bigger Maintenance Min Replication. Setting the 
> upper limit as dfs.replication.max would be overkill, as it could trigger 
> re-replication, which Maintenance State is trying to avoid. So, we could 
> allow {{dfs.namenode.maintenance.replication.min}} in the range {{0 to 
> dfs.replication}}.
> {noformat}
> if (minMaintenanceR < 0) {
>   throw new IOException("Unexpected configuration parameters: "
>   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>   + " = " + minMaintenanceR + " < 0");
> }
> if (minMaintenanceR > minR) {
>   throw new IOException("Unexpected configuration parameters: "
>   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>   + " = " + minMaintenanceR + " > "
>   + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
>   + " = " + minR);
> }
> {noformat}






[jira] [Commented] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}

2017-02-15 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868952#comment-15868952
 ] 

Manoj Govindassamy commented on HDFS-11412:
---

Test failures are from the patch. Will submit a revised patch soon. 

> Maintenance minimum replication config value allowable range should be {0 - 
> DefaultReplication}
> ---
>
> Key: HDFS-11412
> URL: https://issues.apache.org/jira/browse/HDFS-11412
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11412.01.patch
>
>
> Currently the allowed value range for Maintenance Min Replication 
> {{dfs.namenode.maintenance.replication.min}} is 0 to 
> {{dfs.namenode.replication.min}} (default=1). Users who don't want to affect 
> the performance of the cluster would wish to have a Maintenance Min 
> Replication number greater than 1, say 2. In the current design, it is 
> possible to have this Maintenance Min Replication configuration, but only 
> after changing the NameNode-level Block Min Replication to 2, which could 
> increase the overall latency of client writes.
> Technically speaking, we should be allowing Maintenance Min Replication to be 
> in the range 0 to dfs.replication.max.  
> * There is always the config value of 0 for users not wanting any 
> availability/performance guarantees during maintenance. 
> * And, performance-centric workloads can still get maintenance done without 
> major disruptions by having a bigger Maintenance Min Replication. Setting the 
> upper limit as dfs.replication.max would be overkill, as it could trigger 
> re-replication, which Maintenance State is trying to avoid. So, we could 
> allow {{dfs.namenode.maintenance.replication.min}} in the range {{0 to 
> dfs.replication}}.
> {noformat}
> if (minMaintenanceR < 0) {
>   throw new IOException("Unexpected configuration parameters: "
>   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>   + " = " + minMaintenanceR + " < 0");
> }
> if (minMaintenanceR > minR) {
>   throw new IOException("Unexpected configuration parameters: "
>   + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>   + " = " + minMaintenanceR + " > "
>   + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
>   + " = " + minR);
> }
> {noformat}






[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes

2017-02-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868948#comment-15868948
 ] 

Ming Ma commented on HDFS-7877:
---

ok. Will follow up the discussion in HDFS-11412.

> Support maintenance state for datanodes
> ---
>
> Key: HDFS-7877
> URL: https://issues.apache.org/jira/browse/HDFS-7877
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-7877-2.patch, HDFS-7877.patch, 
> Supportmaintenancestatefordatanodes-2.pdf, 
> Supportmaintenancestatefordatanodes.pdf
>
>
> This requirement came up during the design for HDFS-7541. Given this feature 
> is mostly independent of the upgrade domain feature, it is better to track it 
> under a separate jira. The design and draft patch will be available soon.






[jira] [Commented] (HDFS-11414) Ozone : move StorageContainerLocation protocol to hdfs-client

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868898#comment-15868898
 ] 

Hadoop QA commented on HDFS-11414:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
49s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
32s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 86 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
51s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
40s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 12 
new + 86 unchanged - 0 fixed = 98 total (was 86) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 10s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
19s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  
org.apache.hadoop.ozone.protocol.proto.StorageContainerLocationProtocolProtos$ContainerRequestProto.PARSER
 isn't final but should be  At StorageContainerLocationProtocolProtos.java:be  
At StorageContainerLocationProtocolProtos.java:[line 2859] |
|  |  Class 
org.apache.hadoop.ozone.protocol.proto.StorageContainerLocationProtocolProtos$ContainerRequestProto
 defines non-transient non-serializable instance field unknownFields  In 
StorageContainerLocationProtocolProtos.java:instance field unknownFields  In 

[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over

2017-02-15 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868884#comment-15868884
 ] 

Yongjun Zhang commented on HDFS-11352:
--

HI [~xkrogen] and [~ajisakaa],

Thanks for your work here.

It looks to me that the reason trunk doesn't need this patch is that it has 
HDFS-7501. Because HDFS-7501 is not backported to 2.7.x and 2.6.x, we need 
HDFS-11352 here. Does that sound correct to you?

Thanks.





> Potential deadlock in NN when failing over
> --
>
> Key: HDFS-11352
> URL: https://issues.apache.org/jira/browse/HDFS-11352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 2.6.6
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Critical
>  Labels: high-availability
> Fix For: 2.7.4, 2.6.6
>
> Attachments: HDFS-11352-branch-2.7.000.patch
>
>
> HDFS-11180 fixed a general class of deadlock that can occur when failing over 
>  between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for 
> more details). In trunk and branch-2/branch-2.8 this fix was successful by 
> making the metrics calls not synchronize on FSEditLog.
> In branch-2.6 and branch-2.7 there is one more method, 
> {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the 
> lock on FSEditLog and thus can result in the same deadlock scenario. This can 
> be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the 
> patch in HDFS-11290_ on either of these branches (it fails currently).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11414) Ozone : move StorageContainerLocation protocol to hdfs-client

2017-02-15 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-11414:
--
Attachment: HDFS-11414-HDFS-7240.003.patch

Fix the missing package-info warnings.

> Ozone : move StorageContainerLocation protocol to hdfs-client
> -
>
> Key: HDFS-11414
> URL: https://issues.apache.org/jira/browse/HDFS-11414
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11414-HDFS-7240.001.patch, 
> HDFS-11414-HDFS-7240.002.patch, HDFS-11414-HDFS-7240.003.patch
>
>
> {{StorageContainerLocation}} classes are client-facing classes of containers, 
> similar to {{XceiverClient}}, so they should be moved to hadoop-hdfs-client.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11184) Ozone: SCM: Make SCM use container protocol

2017-02-15 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868753#comment-15868753
 ] 

Chen Liang commented on HDFS-11184:
---

v005 patch LGTM, +1

> Ozone: SCM: Make SCM use container protocol
> ---
>
> Key: HDFS-11184
> URL: https://issues.apache.org/jira/browse/HDFS-11184
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-11184-HDFS-7240.001.patch, 
> HDFS-11184-HDFS-7240.002.patch, HDFS-11184-HDFS-7240.003.patch, 
> HDFS-11184-HDFS-7240.004.patch, HDFS-11184-HDFS-7240.005.patch
>
>
> SCM will start using the container protocol to communicate with datanodes. 
> This change introduces some test failures due to some missing features which 
> will be moved to KSM. Will file a separate JIRA to track the disabled ozone 
> tests. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2017-02-15 Thread Chen Liang (JIRA)
Chen Liang created HDFS-11419:
-

 Summary: BlockPlacementPolicyDefault is choosing datanode in an 
inefficient way
 Key: HDFS-11419
 URL: https://issues.apache.org/jira/browse/HDFS-11419
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Chen Liang


Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
calling into {{chooseRandom}}, which will first find a random datanode by 
calling
{code}
DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);
{code}
and then check whether that returned datanode satisfies the storage type 
requirement:
{code}
storage = chooseStorage4Block(
    chosenNode, blocksize, results, entry.getKey());
{code}
If yes, {{numOfReplicas--;}}; otherwise, the node is added to the excluded 
nodes, and the loop runs again until {{numOfReplicas}} is down to 0.

A problem here is that the storage type is not considered until after a random 
node has already been returned. We've seen a case where a cluster has a large 
number of datanodes while only a few satisfy the storage type condition. So, 
for the most part, this code blindly picks random datanodes that do not satisfy 
the storage type, adds each one to the excluded set, and tries again and again.

To make matters worse, the way {{NetworkTopology#chooseRandom}} works is that, 
given a set of excluded nodes, it first finds a random datanode; then, if that 
node is in the excluded set, it tries to find another random node. So the more 
excluded nodes there are, the more likely a randomly picked node is already in 
the excluded set, in which case we have basically wasted one iteration.

Therefore, this JIRA proposes to augment/modify the relevant classes so that 
datanodes can be found more efficiently. There are currently two different 
high-level solutions we are considering:

1. Add a field to the Node base types to describe the storage type info, and 
when searching for a node, take such field(s) into account and do not return a 
node that does not meet the storage type requirement.

2. Change the {{NetworkTopology}} class to be aware of storage types: for each 
storage type, there is one tree subset that connects all the nodes with that 
type, and a search happens on only one such subset. So unwanted storage types 
are simply not in the search space. 

Thanks [~szetszwo] for the offline discussion, and any comments are more than 
welcome.
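
As a rough, self-contained illustration of the wasted iterations (the numbers 
and node model below are made up for the example; this is not the actual 
{{NetworkTopology}} code):

{code}
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ChooseRandomSim {
  public static void main(String[] args) {
    int totalNodes = 10_000;   // large cluster
    int matching = 50;         // only a few nodes satisfy the storage type
    int wanted = 3;            // replicas to place
    Random rand = new Random(42);
    Set<Integer> excluded = new HashSet<>();
    int found = 0, iterations = 0;
    while (found < wanted) {
      iterations++;
      int node = rand.nextInt(totalNodes);
      if (excluded.contains(node)) {
        continue;              // wasted pick: already-excluded node
      }
      if (node < matching) {   // pretend nodes [0, matching) match the type
        found++;
      } else {
        excluded.add(node);    // mismatch: grows the excluded set
      }
    }
    System.out.println("random picks needed: " + iterations);
  }
}
{code}

With these made-up numbers, hundreds of random picks are typically needed to 
place 3 replicas; a storage-type-aware topology would need only 3.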






[jira] [Commented] (HDFS-11375) Display the volume storage type in datanode UI

2017-02-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868573#comment-15868573
 ] 

Hudson commented on HDFS-11375:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11259 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11259/])
HDFS-11375. Display the volume storage type in datanode UI. Contributed 
(liuml07: rev 0741dd3b9abdeb65bb783c1a8b01f078c4bdba17)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode/datanode.html


> Display the volume storage type in datanode UI
> --
>
> Key: HDFS-11375
> URL: https://issues.apache.org/jira/browse/HDFS-11375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, 
> HDFS-11375.02.patch
>
>
> Volume storage info is useful for debugging the issue related to policy... 






[jira] [Updated] (HDFS-11375) Display the volume storage type in datanode UI

2017-02-15 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-11375:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha3
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}} and {{branch-2}} branches. Thanks for your contribution 
[~surendrasingh]. Thanks for your review [~brahmareddy] and [~hanishakoneru].

> Display the volume storage type in datanode UI
> --
>
> Key: HDFS-11375
> URL: https://issues.apache.org/jira/browse/HDFS-11375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, 
> HDFS-11375.02.patch
>
>
> Volume storage info is useful for debugging the issue related to policy... 






[jira] [Commented] (HDFS-11418) HttpFS should support old SSL clients

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868527#comment-15868527
 ] 

Hadoop QA commented on HDFS-11418:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 
51s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m 
13s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
58s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
7s{color} | {color:red} The patch generated 18 new + 518 unchanged - 0 fixed = 
536 total (was 518) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed with JDK 
v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-11418 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12852895/HDFS-11418.branch-2.001.patch
 |
| Optional Tests |  asflicense  mvnsite  unit  shellcheck  shelldocs  compile  
javac  javadoc  mvninstall  xml  |
| uname | Linux 63d601be64cb 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / 323782b |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_121 

[jira] [Commented] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868510#comment-15868510
 ] 

Wei-Chiu Chuang commented on HDFS-11225:


Actually, because INodeDirectory.cleanSubtree is invoked in a few other places 
(e.g. FSDirectory.unprotectedDelete and FSDirectory.unprotectedRenameTo), 
delete and rename operations may also suffer from the same bug.

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>Priority: Critical
>  Labels: high-availability
>
> The deleteSnapshot operation is synchronous. In certain situations this 
> operation may hold FSNamesystem lock for too long, bringing almost every 
> NameNode operation to a halt.
> We have observed one incident where it took so long that the ZKFC believed 
> the NameNode was down. All other IPC threads were waiting to acquire the 
> FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. The ZKFC 
> has a connection timeout of 45 seconds by default, and if all IPC threads 
> wait for the FSNamesystem lock and can't accept new incoming connections, 
> the ZKFC times out and advances the epoch, and the NameNode therefore loses 
> its active role and then fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After the ZKFC determined NameNode was down and advanced epoch, the NN 
> finished deleting snapshot, and sent the edit to journal nodes, but it was 
> rejected because epoch was updated. See the following stacktrace:
> {noformat}
> 10.0.16.21:8485: IPC's epoch 17 is less than the last promised epoch 18
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352)
> at 
> 

[jira] [Updated] (HDFS-11418) HttpFS should support old SSL clients

2017-02-15 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-11418:
--
Attachment: HDFS-11418.branch-2.001.patch

Patch branch-2.001
* Add env HTTPFS_SSL_CIPHERS, default to a list of selected ciphers
* Configure Tomcat to accept a list of ciphers

TODO
* Discuss Allen's idea of strong security by default

Testing done
* hadoop-hdfs-httpfs unit tests
* Verify HTTPFS_SSL_CIPHERS value on stdout during httpfs startup
* Run https://github.com/jzhuge/hadoop-bats-tests/blob/master/httpfs.bats in 
insecure, SSL, and SSL+Kerberos single node setups
* sslscan results should include only the listed ciphers
* On Centos 6.6, run the following curl command. Expect {{NSS error -12286}} 
without the fix.
{noformat}
curl -v -k --negotiate -u: -sS 
'https://HTTPFS_HOST:14000/webhdfs/v1/?op=liststatus'
{noformat}

> HttpFS should support old SSL clients
> -
>
> Key: HDFS-11418
> URL: https://issues.apache.org/jira/browse/HDFS-11418
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.8.0, 2.7.4, 2.6.6
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-11418.branch-2.001.patch
>
>
> HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL 
> clients such as curl stop working. The symptom is {{NSS error -12286}} when 
> running {{curl -v}}.
> Instead of forcing the SSL clients to upgrade, we can configure Tomcat to 
> explicitly allow enough weak ciphers so that old SSL clients can work.






[jira] [Updated] (HDFS-11418) HttpFS should support old SSL clients

2017-02-15 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-11418:
--
Status: Patch Available  (was: Open)

> HttpFS should support old SSL clients
> -
>
> Key: HDFS-11418
> URL: https://issues.apache.org/jira/browse/HDFS-11418
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.8.0, 2.7.4, 2.6.6
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HDFS-11418.branch-2.001.patch
>
>
> HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL 
> clients such as curl stop working. The symptom is {{NSS error -12286}} when 
> running {{curl -v}}.
> Instead of forcing the SSL clients to upgrade, we can configure Tomcat to 
> explicitly allow enough weak ciphers so that old SSL clients can work.






[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868472#comment-15868472
 ] 

Hadoop QA commented on HDFS-10899:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
57s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10899 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12852861/HDFS-10899.09.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  xml  |
| uname | Linux 9223997301ec 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|

[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868464#comment-15868464
 ] 

Hadoop QA commented on HDFS-10899:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
6s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 28s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10899 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12852861/HDFS-10899.09.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  xml  |
| uname | Linux 0d52057c4281 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 
16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0fc6f38 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 

[jira] [Commented] (HDFS-11375) Display the volume storage type in datanode UI

2017-02-15 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868453#comment-15868453
 ] 

Mingliang Liu commented on HDFS-11375:
--

+1

The failing tests are not related; let's fix checkstyle warnings with existing 
ones.

> Display the volume storage type in datanode UI
> --
>
> Key: HDFS-11375
> URL: https://issues.apache.org/jira/browse/HDFS-11375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, 
> HDFS-11375.02.patch
>
>
> Volume storage info is useful for debugging the issue related to policy... 






[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.

2017-02-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868417#comment-15868417
 ] 

Kihwal Lee commented on HDFS-11417:
---

Sorry, it is actually you who are working on both issues! 

> Add datanode admin command to get the storage info.
> ---
>
> Key: HDFS-11417
> URL: https://issues.apache.org/jira/browse/HDFS-11417
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.7.3
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>
> It is good to add an admin command for the datanode to get data directory 
> info like storage type, directory path, number of blocks, capacity, and used 
> space. This will be helpful in large clusters where a DN has multiple data 
> directories configured. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.

2017-02-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868413#comment-15868413
 ] 

Kihwal Lee commented on HDFS-11417:
---

More info was added to the datanode UI, and HDFS-11375 is improving it. You might 
want to give them input if something you think is important is missing.

> Add datanode admin command to get the storage info.
> ---
>
> Key: HDFS-11417
> URL: https://issues.apache.org/jira/browse/HDFS-11417
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.7.3
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>
> It is good to add an admin command for the datanode to get data directory 
> info like storage type, directory path, number of blocks, capacity, and used 
> space. This will be helpful in large clusters where a DN has multiple data 
> directories configured. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size

2017-02-15 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868412#comment-15868412
 ] 

Jing Zhao commented on HDFS-8498:
-

[~jojochuang], currently I do not plan to backport this change to branch 2.x. 
But please feel free to do it if you think it's necessary and I will be happy 
to review.

> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch
>
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt; the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas 
> are corrupt after full BRs are received from the original DNs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size

2017-02-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868394#comment-15868394
 ] 

Hudson commented on HDFS-8498:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11258 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11258/])
HDFS-8498. Blocks can be committed with wrong size. Contributed by Jing (jing9: 
rev 627da6f7178e18aa41996969c408b6f344e297d1)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripedDataStreamer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch
>
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt; the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas 
> are corrupt after full BRs are received from the original DNs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868354#comment-15868354
 ] 

Wei-Chiu Chuang commented on HDFS-8498:
---

[~jingzhao] very nice work!
Do you plan to cherry-pick the fix into the 2.x branches?

Thanks!

> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch
>
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt; the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas 
> are corrupt after full BRs are received from the original DNs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size

2017-02-15 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868333#comment-15868333
 ] 

Jing Zhao commented on HDFS-8498:
-

I've committed the patch into trunk.

> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch
>
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt; the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas 
> are corrupt after full BRs are received from the original DNs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8498) Blocks can be committed with wrong size

2017-02-15 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8498:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha3
   Status: Resolved  (was: Patch Available)

Thanks for the review, [~jnp]! I will commit the patch shortly.

> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch
>
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt; the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas 
> are corrupt after full BRs are received from the original DNs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11418) HttpFS should support old SSL clients

2017-02-15 Thread John Zhuge (JIRA)
John Zhuge created HDFS-11418:
-

 Summary: HttpFS should support old SSL clients
 Key: HDFS-11418
 URL: https://issues.apache.org/jira/browse/HDFS-11418
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: httpfs
Affects Versions: 2.8.0, 2.7.4, 2.6.6
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Minor


HADOOP-13812 upgraded Tomcat to 6.0.48, which filters out weak ciphers. Old SSL 
clients such as curl stop working. The symptom is {{NSS error -12286}} when 
running {{curl -v}}.

Instead of forcing the SSL clients to upgrade, we can configure Tomcat to 
explicitly allow enough weak ciphers so that old SSL clients can work.
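
As a rough illustration only: assuming HttpFS's bundled Tomcat uses a standard 
HTTPS connector in its server.xml, the change would amount to something like the 
snippet below. The port and cipher list are placeholders, not the actual 
proposal.

{noformat}
<!-- Hypothetical sketch: explicitly list the ciphers Tomcat may negotiate,
     including a couple of weaker ones that old curl/NSS builds still need. -->
<Connector port="14000" protocol="HTTP/1.1" SSLEnabled="true"
           scheme="https" secure="true" clientAuth="false" sslProtocol="TLS"
           ciphers="TLS_RSA_WITH_AES_128_CBC_SHA,SSL_RSA_WITH_3DES_EDE_CBC_SHA"/>
{noformat}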



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11417) Add datanode admin command to get the storage info.

2017-02-15 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-11417:
-

 Summary: Add datanode admin command to get the storage info.
 Key: HDFS-11417
 URL: https://issues.apache.org/jira/browse/HDFS-11417
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 2.7.3
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


It is good to add an admin command for the datanode to get data directory info 
like storage type, directory path, number of blocks, capacity, and used space. 
This will be helpful in large clusters where a DN has multiple data directories 
configured. 
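
As a sketch of what such a command could look like, the command name and output 
below are purely hypothetical; nothing like this exists yet:

{noformat}
$ hdfs dfsadmin -getVolumeReport dn1.example.com:50020
Storage Type  Directory        Blocks  Capacity  Used
DISK          /data/1/dfs/dn   10432   1.8 TB    1.1 TB
ARCHIVE       /data/2/dfs/dn   98210   7.3 TB    6.9 TB
{noformat}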



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10899:
-
Attachment: (was: HDFS-10899.09.patch)

> Add functionality to re-encrypt EDEKs.
> --
>
> Key: HDFS-10899
> URL: https://issues.apache.org/jira/browse/HDFS-10899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, 
> HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, 
> HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, 
> HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt 
> edek design doc.pdf
>
>
> Currently when an encryption zone (EZ) key is rotated, it only takes effect 
> on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key 
> rotation, for improved security.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10899:
-
Attachment: HDFS-10899.09.patch

> Add functionality to re-encrypt EDEKs.
> --
>
> Key: HDFS-10899
> URL: https://issues.apache.org/jira/browse/HDFS-10899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, 
> HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, 
> HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, 
> HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt 
> edek design doc.pdf
>
>
> Currently when an encryption zone (EZ) key is rotated, it only takes effect 
> on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key 
> rotation, for improved security.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10899:
-
Attachment: (was: HDFS-10899.09.patch)

> Add functionality to re-encrypt EDEKs.
> --
>
> Key: HDFS-10899
> URL: https://issues.apache.org/jira/browse/HDFS-10899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, 
> HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, 
> HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, 
> HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt 
> edek design doc.pdf
>
>
> Currently when an encryption zone (EZ) key is rotated, it only takes effect 
> on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key 
> rotation, for improved security.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10899:
-
Attachment: HDFS-10899.09.patch

> Add functionality to re-encrypt EDEKs.
> --
>
> Key: HDFS-10899
> URL: https://issues.apache.org/jira/browse/HDFS-10899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, 
> HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, 
> HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, 
> HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt 
> edek design doc.pdf
>
>
> Currently when an encryption zone (EZ) key is rotated, it only takes effect 
> on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key 
> rotation, for improved security.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6301) NameNode: persist XAttrs in fsimage and record XAttrs modifications to edit log.

2017-02-15 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868107#comment-15868107
 ] 

Xiao Chen commented on HDFS-6301:
-

FYI - created HDFS-11410 for the above; Andrew +1'ed, and I plan to commit today.

> NameNode: persist XAttrs in fsimage and record XAttrs modifications to edit 
> log.
> 
>
> Key: HDFS-6301
> URL: https://issues.apache.org/jira/browse/HDFS-6301
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS XAttrs (HDFS-2006)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: HDFS XAttrs (HDFS-2006)
>
> Attachments: HDFS-6301.1.patch, HDFS-6301.patch
>
>
> Store XAttrs in fsimage so that XAttrs are retained across NameNode restarts.
> Implement a new edit log opcode, {{OP_SET_XATTRS}}.
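
For example, once persisted, an xattr set from the shell survives a NameNode 
restart (the path and attribute below are just examples):

{noformat}
$ hdfs dfs -setfattr -n user.origin -v backup /data/file1
$ # ... restart the NameNode ...
$ hdfs dfs -getfattr -d /data/file1
# file: /data/file1
user.origin="backup"
{noformat}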



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11100) Recursively deleting file protected by sticky bit should fail

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867968#comment-15867968
 ] 

Wei-Chiu Chuang commented on HDFS-11100:


I will commit the 005 patch to trunk by end of day if there are no objections.

> Recursively deleting file protected by sticky bit should fail
> -
>
> Key: HDFS-11100
> URL: https://issues.apache.org/jira/browse/HDFS-11100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
>  Labels: permissions
> Attachments: HDFS-11100.001.patch, HDFS-11100.002.patch, 
> HDFS-11100.003.patch, HDFS-11100.004.patch, HDFS-11100.005.patch, hdfs_cmds
>
>
> Recursively deleting a directory that contains files or directories protected 
> by the sticky bit should fail, but it doesn't in HDFS. In the case below, 
> {{/tmp/test/sticky_dir/f2}} is protected by the sticky bit, thus recursively 
> deleting {{/tmp/test/sticky_dir}} should fail.
> {noformat}
> + hdfs dfs -ls -R /tmp/test
> drwxrwxrwt   - jzhuge supergroup  0 2016-11-03 18:08 
> /tmp/test/sticky_dir
> -rwxrwxrwx   1 jzhuge supergroup  0 2016-11-03 18:08 
> /tmp/test/sticky_dir/f2
> + sudo -u hadoop hdfs dfs -rm -skipTrash /tmp/test/sticky_dir/f2
> rm: Permission denied by sticky bit: user=hadoop, 
> path="/tmp/test/sticky_dir/f2":jzhuge:supergroup:-rwxrwxrwx, 
> parent="/tmp/test/sticky_dir":jzhuge:supergroup:drwxrwxrwt
> + sudo -u hadoop hdfs dfs -rm -r -skipTrash /tmp/test/sticky_dir
> Deleted /tmp/test/sticky_dir
> {noformat}
> Centos 6.4 behavior:
> {noformat}
> $ ls -lR /tmp/test
> /tmp/test: 
> total 4
> drwxrwxrwt 2 systest systest 4096 Nov  3 18:36 sbit
> /tmp/test/sbit:
> total 0
> -rw-rw-rw- 1 systest systest 0 Nov  2 13:45 f2
> $ sudo -u mapred rm -fr /tmp/test/sbit
> rm: cannot remove `/tmp/test/sbit/f2': Operation not permitted
> $ chmod -t /tmp/test/sbit
> $ sudo -u mapred rm -fr /tmp/test/sbit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11333) Print a user friendly error message when plugins are not found

2017-02-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867677#comment-15867677
 ] 

Hudson commented on HDFS-11333:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11256 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11256/])
HDFS-11333. Print a user friendly error message when plugins are not (weichiu: 
rev 859bd159ae554174200334b5eb1d7e8dbef958ad)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> Print a user friendly error message when plugins are not found
> --
>
> Key: HDFS-11333
> URL: https://issues.apache.org/jira/browse/HDFS-11333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, 
> HDFS-11333.002.patch, HDFS-11333.003.patch
>
>
> If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it 
> terminates abruptly with the following stack trace:
> {quote}
> Failed to start namenode.
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not 
> found
>   at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178)
>   at 
> org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
> {quote}
> We should catch this exception, log a warning message and let it proceed, as 
> missing the third party library does not affect the functionality of 
> NameNode. We caught this bug during a CDH upgrade where a third party plugin 
> was not in the lib directory of the newer version of CDH.
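
A minimal sketch of the suggested handling, assuming the plugins are loaded via 
the {{Configuration#getInstances}} call shown in the stack trace (the 
surrounding names are illustrative, not the committed patch):

{code:java}
// Hypothetical sketch: degrade gracefully when a plugin class is missing.
List<ServicePlugin> plugins;
try {
  plugins = conf.getInstances(DFSConfigKeys.DFS_NAMENODE_PLUGINS_KEY,
      ServicePlugin.class);
} catch (RuntimeException e) {
  // A missing third-party jar should not abort NameNode startup.
  LOG.warn("Unable to load NameNode plugins ("
      + conf.get(DFSConfigKeys.DFS_NAMENODE_PLUGINS_KEY)
      + "); continuing without them.", e);
  plugins = Collections.emptyList();
}
{code}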



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11333:
---
   Resolution: Fixed
Fix Version/s: 2.8.1
   3.0.0-alpha3
   2.7.4
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed the patch to trunk, branch-2, branch-2.8 and branch-2.7.
Thanks very much for the review from [~linyiqun] and comments from [~aw]!

> Print a user friendly error message when plugins are not found
> --
>
> Key: HDFS-11333
> URL: https://issues.apache.org/jira/browse/HDFS-11333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, 
> HDFS-11333.002.patch, HDFS-11333.003.patch
>
>
> If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it 
> terminates abruptly with the following stack trace:
> {quote}
> Failed to start namenode.
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not 
> found
>   at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178)
>   at 
> org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
> {quote}
> We should catch this exception, log a warning message and let it proceed, as 
> missing the third party library does not affect the functionality of 
> NameNode. We caught this bug during a CDH upgrade where a third party plugin 
> was not in the lib directory of the newer version of CDH.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11333:
---
Priority: Minor  (was: Major)

> Print a user friendly error message when plugins are not found
> --
>
> Key: HDFS-11333
> URL: https://issues.apache.org/jira/browse/HDFS-11333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, 
> HDFS-11333.002.patch, HDFS-11333.003.patch
>
>
> If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it 
> terminates abruptly with the following stack trace:
> {quote}
> Failed to start namenode.
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not 
> found
>   at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178)
>   at 
> org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
> {quote}
> We should catch this exception, log a warning message and let it proceed, as 
> missing the third party library does not affect the functionality of 
> NameNode. We caught this bug during a CDH upgrade where a third party plugin 
> was not in the lib directory of the newer version of CDH.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11333:
---
Issue Type: Improvement  (was: Bug)

> Print a user friendly error message when plugins are not found
> --
>
> Key: HDFS-11333
> URL: https://issues.apache.org/jira/browse/HDFS-11333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, 
> HDFS-11333.002.patch, HDFS-11333.003.patch
>
>
> If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it 
> terminates abruptly with the following stack trace:
> {quote}
> Failed to start namenode.
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not 
> found
>   at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178)
>   at 
> org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
> {quote}
> We should catch this exception, log a warning message and let it proceed, as 
> missing the third party library does not affect the functionality of 
> NameNode. We caught this bug during a CDH upgrade where a third party plugin 
> was not in the lib directory of the newer version of CDH.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11333:
---
Summary: Print a user friendly error message when plugins are not found  
(was: Namenode unable to start if plugins can not be found)

> Print a user friendly error message when plugins are not found
> --
>
> Key: HDFS-11333
> URL: https://issues.apache.org/jira/browse/HDFS-11333
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, 
> HDFS-11333.002.patch, HDFS-11333.003.patch
>
>
> If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it 
> terminates abruptly with the following stack trace:
> {quote}
> Failed to start namenode.
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not 
> found
>   at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178)
>   at 
> org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
> {quote}
> We should catch this exception, log a warning message and let it proceed, as 
> missing the third party library does not affect the functionality of 
> NameNode. We caught this bug during a CDH upgrade where a third party plugin 
> was not in the lib directory of the newer version of CDH.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"

2017-02-15 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867561#comment-15867561
 ] 

Brahma Reddy Battula commented on HDFS-6804:


[~jojochuang] As you mentioned, the issue was fixed by HDFS-11160. I just 
updated the test case for this particular scenario, which is a very rare race; 
verified by reverting HDFS-11160 and HDFS-11056.

> race condition between transferring block and appending block causes 
> "Unexpected checksum mismatch exception" 
> --
>
> Key: HDFS-6804
> URL: https://issues.apache.org/jira/browse/HDFS-6804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0
>Reporter: Gordon Wang
>Assignee: Brahma Reddy Battula
> Attachments: Testcase_append_transfer_block.patch
>
>
> We found some error logs in the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Exception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException: 
> Unexpected checksum mismatch while writing 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from 
> /192.168.2.101:39495
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> While on the source datanode, the log says the block is transmitted.
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DataTransfer: Transmitted 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 
> (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad 
> block to the NameNode, and the NameNode marks the replica on the source 
> datanode as corrupt. But the replica on the source datanode is actually 
> valid, because it can pass checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867528#comment-15867528
 ] 

Hadoop QA commented on HDFS-10899:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
43s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m 43s{color} | 
{color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 43s{color} 
| {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
37s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 2 new + 7 
unchanged - 0 fixed = 9 total (was 7) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10899 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12852766/HDFS-10899.09.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  xml  |
| uname | Linux 3135be15d837 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cd3e59a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18379/artifact/patchprocess/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| compile | 

[jira] [Reopened] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-6804:
---

[~brahmareddy] I haven't run the patch, but if you attach a patch I am happy to 
review it. Assigning the jira to you, thanks!

> race condition between transferring block and appending block causes 
> "Unexpected checksum mismatch exception" 
> --
>
> Key: HDFS-6804
> URL: https://issues.apache.org/jira/browse/HDFS-6804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0
>Reporter: Gordon Wang
>Assignee: Brahma Reddy Battula
> Attachments: Testcase_append_transfer_block.patch
>
>
> We found some error logs in the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Exception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException: 
> Unexpected checksum mismatch while writing 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from 
> /192.168.2.101:39495
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> While on the source datanode, the log says the block is transmitted.
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DataTransfer: Transmitted 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 
> (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad 
> block to the NameNode, and the NameNode marks the replica on the source 
> datanode as corrupt. But the replica on the source datanode is actually 
> valid, because it can pass checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"

2017-02-15 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-6804:
-

Assignee: Brahma Reddy Battula  (was: Wei-Chiu Chuang)

> race condition between transferring block and appending block causes 
> "Unexpected checksum mismatch exception" 
> --
>
> Key: HDFS-6804
> URL: https://issues.apache.org/jira/browse/HDFS-6804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.2.0
>Reporter: Gordon Wang
>Assignee: Brahma Reddy Battula
> Attachments: Testcase_append_transfer_block.patch
>
>
> We found some error logs in the datanode, like this:
> {noformat}
> 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Exception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248
> java.io.IOException: Terminating due to a checksum error.java.io.IOException: 
> Unexpected checksum mismatch while writing 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from 
> /192.168.2.101:39495
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> While on the source datanode, the log says the block is transmitted.
> {noformat}
> 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DataTransfer: Transmitted 
> BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 
> (numBytes=16188152) to /192.168.2.103:50010
> {noformat}
> When the destination datanode gets the checksum mismatch, it reports a bad 
> block to the NameNode, and the NameNode marks the replica on the source 
> datanode as corrupt. But the replica on the source datanode is actually 
> valid, because it can pass checksum verification.
> In short, the replica on the source datanode is wrongly marked as corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11391) Numeric usernames do no work with WebHDFS FS (write access)

2017-02-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867501#comment-15867501
 ] 

ASF GitHub Bot commented on HDFS-11391:
---

Github user pvillard31 closed the pull request at:

https://github.com/apache/hadoop/pull/186


> Numeric usernames do no work with WebHDFS FS (write access)
> ---
>
> Key: HDFS-11391
> URL: https://issues.apache.org/jira/browse/HDFS-11391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Pierre Villard
>Assignee: Pierre Villard
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1
>
>
> In HDFS-4983, a property was introduced to configure the pattern that 
> validates the names of users interacting with WebHDFS, because the default 
> pattern excluded names starting with numbers.
> The problem is that this fix only works for read access. For write access 
> against the datanode, the default pattern is still applied regardless of the 
> configuration.
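
For reference, the property introduced by HDFS-4983 can be relaxed like this 
(the property name is real; the value is just an example pattern that also 
accepts leading digits):

{noformat}
<property>
  <name>dfs.webhdfs.user.provider.user.pattern</name>
  <value>^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$</value>
</property>
{noformat}

Per this issue, the datanode write path ignores this setting and keeps using 
the default pattern.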



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11391) Numeric usernames do no work with WebHDFS FS (write access)

2017-02-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867500#comment-15867500
 ] 

ASF GitHub Bot commented on HDFS-11391:
---

Github user pvillard31 commented on the issue:

https://github.com/apache/hadoop/pull/186
  
Committed as part of 8e53f2b9b08560bf4f8e81e697063277dbdc68f9. Closing.


> Numeric usernames do no work with WebHDFS FS (write access)
> ---
>
> Key: HDFS-11391
> URL: https://issues.apache.org/jira/browse/HDFS-11391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Pierre Villard
>Assignee: Pierre Villard
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1
>
>
> In HDFS-4983, a property was introduced to configure the pattern that 
> validates the names of users interacting with WebHDFS, because the default 
> pattern excluded names starting with numbers.
> The problem is that this fix only works for read access. For write access 
> against the datanode, the default pattern is still applied regardless of the 
> configuration.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.

2017-02-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-10899:
-
Attachment: HDFS-10899.09.patch

patch 9:
- Added a throttler. Note this won't be very accurate, since we do not want 
frequent lock/unlock and the throttling has to happen outside the locks after a 
batch is created. Better approaches welcome. (Wasn't sure whether follow-on 
means a separate jira is preferable; I can split that out if so, to keep the 
central patch size sane.)
- Should have mentioned in patch 8: I think the code is more readable after the 
refactor, but it seems we can't lock/unlock within each method. This is similar 
to {{FSDirEncryptionZoneOp#getEncryptionKeyInfo}}.
- Fixed precommit complaints.
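
As a rough illustration of the batch-then-throttle shape described above (a 
hand-wavy sketch with made-up names, not the patch itself):

{code:java}
// Hypothetical sketch: build each batch under the namesystem lock, then
// re-encrypt and throttle outside it, so the lock is never held while waiting.
while (hasMoreFilesToReencrypt()) {
  List<FileEdekInfo> batch;
  namesystem.writeLock();
  try {
    batch = collectNextBatch(BATCH_SIZE);   // gather EDEKs to re-encrypt
  } finally {
    namesystem.writeUnlock();
  }
  reencryptWithKms(batch);                  // KMS round-trips happen unlocked
  throttler.throttle(batch.size());         // coarse: per batch, not per file
}
{code}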

> Add functionality to re-encrypt EDEKs.
> --
>
> Key: HDFS-10899
> URL: https://issues.apache.org/jira/browse/HDFS-10899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption, kms
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, 
> HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, 
> HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, 
> HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt 
> edek design doc.pdf
>
>
> Currently when an encryption zone (EZ) key is rotated, it only takes effect 
> on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key 
> rotation, for improved security.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11238) Fix checkstyle warnings in NameNode#createNameNode

2017-02-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867461#comment-15867461
 ] 

Hudson commented on HDFS-11238:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11252 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11252/])
HDFS-11238. Fix checkstyle warnings in NameNode#createNameNode. (aajisaka: rev 
8acb376c9c5f7f52a097be221ed18877a403bece)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> Fix checkstyle warnings in NameNode#createNameNode
> --
>
> Key: HDFS-11238
> URL: https://issues.apache.org/jira/browse/HDFS-11238
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Trivial
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11238.001.patch
>
>
> switch-case should be at the same level; avoid nested blocks; array brackets 
> at illegal position



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11401) Reduce verbosity of logs with favored nodes and block pinning enabled

2017-02-15 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867450#comment-15867450
 ] 

Weiwei Yang commented on HDFS-11401:


Hi [~rakeshr], [~umamaheswararao]

Please let me know if you have any comments with v2 patch, thank you!

> Reduce verbosity of logs with favored nodes and block pinning enabled
> -
>
> Key: HDFS-11401
> URL: https://issues.apache.org/jira/browse/HDFS-11401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, datanode
>Affects Versions: 2.8.0
>Reporter: Thiruvel Thirumoolan
>Assignee: Weiwei Yang
>Priority: Minor
> Attachments: HDFS-11401.01.patch, HDFS-11401.02.patch, 
> testBalancerWithPinnedBlocks.output.txt
>
>
> I am working on enabling favored nodes for HBase (HBASE-15531) and was trying 
> out what happens when favored nodes are used and the HDFS balancer is enabled. 
> I ran the unit test TestBalancer#testBalancerWithPinnedBlocks from branch-2.8. 
> There were too many exceptions and error messages (output attached), since 
> pinned blocks can't be moved.
> Is there any way to reduce this logging, since block pinning is intentional? 
> On a real cluster, this could be too much. HDFS-6133 enabled block pinning.
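
For context, favored nodes are supplied at file-creation time. Below is a 
minimal sketch using the existing {{DistributedFileSystem}} create overload; 
hostnames, sizes, and the path are placeholders:

{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class FavoredNodesExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    // Placeholder hostnames; 50010 is the default DN transfer port in 2.x.
    InetSocketAddress[] favored = {
        new InetSocketAddress("dn1.example.com", 50010),
        new InetSocketAddress("dn2.example.com", 50010)
    };
    // With dfs.datanode.block-pinning.enabled=true, blocks written to the
    // favored nodes get pinned there, which is why the balancer cannot move
    // them and logs the errors discussed here.
    try (FSDataOutputStream out = dfs.create(new Path("/hbase/region1"),
        FsPermission.getFileDefault(), true, 4096, (short) 3,
        128L * 1024 * 1024, null, favored)) {
      out.writeBytes("example");
    }
  }
}
{code}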



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org