[jira] [Commented] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564335#comment-15564335
 ] 

Hadoop QA commented on HDFS-10906:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 598 new + 0 unchanged - 0 fixed = 598 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 468 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 78m  
5s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}101m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10906 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832592/HDFS-10906.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ca54550115a4 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17095/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17095/artifact/patchprocess/whitespace-tabs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17095/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17095/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add unit tests for Trash with HDFS encryption zones
> ---
>
> Key: HDFS-10906
> URL: https://issues.apache.org/jira/browse/HDFS-10906
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  

[jira] [Created] (HDFS-10994) Support "XOR-2-1-64k" policy in "hdfs erasurecode" command

2016-10-10 Thread SammiChen (JIRA)
SammiChen created HDFS-10994:


 Summary: Support "XOR-2-1-64k" policy in "hdfs erasurecode" command
 Key: HDFS-10994
 URL: https://issues.apache.org/jira/browse/HDFS-10994
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: SammiChen
Assignee: SammiChen


So far, "hdfs erasurecode" command supports three policies, RS-DEFAULT-3-2-64k, 
RS-DEFAULT-6-3-64k and RS-LEGACY-6-3-64k. This task is going to add XOR-2-1-64k 
policy to this command.






[jira] [Commented] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564257#comment-15564257
 ] 

Hadoop QA commented on HDFS-10986:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 19s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10986 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832584/HDFS-10986.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 7c60a3d22a5e 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17094/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17094/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17094/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: 

[jira] [Commented] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API

2016-10-10 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564190#comment-15564190
 ] 

Surendra Singh Lilhore commented on HDFS-10991:
---

Thanks [~James C] for the review.

> libhdfs :  Client compilation is failing for hdfsTruncateFile API
> -
>
> Key: HDFS-10991
> URL: https://issues.apache.org/jira/browse/HDFS-10991
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-10991.patch
>
>
> {noformat}
> /tmp/ccJNUj6m.o: In function `main':
> test.c:(.text+0x812): undefined reference to `hdfsTruncateFile'
> collect2: ld returned 1 exit status
> {noformat}






[jira] [Updated] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API

2016-10-10 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-10991:
--
Affects Version/s: 2.7.0

> libhdfs :  Client compilation is failing for hdfsTruncateFile API
> -
>
> Key: HDFS-10991
> URL: https://issues.apache.org/jira/browse/HDFS-10991
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.7.0
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-10991.patch
>
>
> {noformat}
> /tmp/ccJNUj6m.o: In function `main':
> test.c:(.text+0x812): undefined reference to `hdfsTruncateFile'
> collect2: ld returned 1 exit status
> {noformat}






[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564168#comment-15564168
 ] 

Brahma Reddy Battula commented on HDFS-10988:
-

[~liuml07], thanks for the review and commit.

> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following:
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources (see the sketch below)
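>
> A rough sketch of the intended shape (illustrative only; it assumes MiniDFSCluster 
> can be used in try-with-resources and that the DataNode exposes a 
> getBalancerBandwidth() accessor for the polling check):
> {code}
> Configuration conf = new HdfsConfiguration();
> try (MiniDFSCluster cluster =
>     new MiniDFSCluster.Builder(conf).numDataNodes(2).build()) {
>   cluster.waitActive();
>   DistributedFileSystem fs = cluster.getFileSystem();
>   final long newBandwidth = 12 * 1024 * 1024;
>   fs.setBalancerBandwidth(newBandwidth);
>   // Poll for the new value instead of a fixed Thread.sleep().
>   GenericTestUtils.waitFor(
>       () -> cluster.getDataNodes().get(0).getBalancerBandwidth() == newBandwidth,
>       100, 60000);
> }
> {code}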






[jira] [Commented] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564161#comment-15564161
 ] 

Hadoop QA commented on HDFS-10986:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 
20s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10986 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832579/HDFS-10986.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ba22a0b56351 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17093/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17093/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, 
> HDFS-10986.001.patch, HDFS-10986.002.patch
>
>
> There are some subcommands in {{DFSAdmin}} that swallow IOException and give a 
> very limited error message, if any, to stderr.
> {code}
> 

[jira] [Commented] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones

2016-10-10 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564152#comment-15564152
 ] 

Hanisha Koneru commented on HDFS-10906:
---

Thank you [~xyao]. I have added two tests to cover the list of test cases.

> Add unit tests for Trash with HDFS encryption zones
> ---
>
> Key: HDFS-10906
> URL: https://issues.apache.org/jira/browse/HDFS-10906
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.8.0
>Reporter: Xiaoyu Yao
>Assignee: Hanisha Koneru
> Attachments: HDFS-10906.000.patch
>
>
> The goal is to improve unit test coverage for HDFS trash with encryption zones, 
> especially in a Kerberos environment. The current unit test 
> TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage of the 
> non-Kerberos case.
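>
> A rough sketch of the non-Kerberos flavor of such a test (illustrative only: it 
> assumes {{fs}} is the mini cluster's DistributedFileSystem, {{dfsAdmin}} is an 
> HdfsAdmin as in TestEncryptionZones, TEST_KEY is a key provisioned in the test 
> KMS during setup, and fs.trash.interval is enabled in {{conf}}):
> {code}
> final Path zone = new Path("/zone");
> fs.mkdirs(zone);
> dfsAdmin.createEncryptionZone(zone, TEST_KEY);
> final Path file = new Path(zone, "file");
> DFSTestUtil.createFile(fs, file, 1024, (short) 1, 0xFEED);
> // Delete through the shell; without -skipTrash the file must move into the
> // encryption zone's own .Trash directory, not /user/<user>/.Trash.
> FsShell shell = new FsShell(conf);
> assertEquals(0, ToolRunner.run(shell, new String[] {"-rm", file.toString()}));
> final String user = UserGroupInformation.getCurrentUser().getShortUserName();
> Path trashed = new Path(zone,
>     ".Trash/" + user + "/Current" + file.toUri().getPath());
> assertTrue(fs.exists(trashed));
> {code}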






[jira] [Updated] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones

2016-10-10 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-10906:
--
Status: Patch Available  (was: Open)

> Add unit tests for Trash with HDFS encryption zones
> ---
>
> Key: HDFS-10906
> URL: https://issues.apache.org/jira/browse/HDFS-10906
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.8.0
>Reporter: Xiaoyu Yao
>Assignee: Hanisha Koneru
> Attachments: HDFS-10906.000.patch
>
>
> The goal is to improve unit test coverage for HDFS trash with encryption zones, 
> especially in a Kerberos environment. The current unit test 
> TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage of the 
> non-Kerberos case.






[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-10-10 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564153#comment-15564153
 ] 

Jingcheng Du commented on HDFS-9668:


Thanks [~eddyxu]!
I will update the patch after HADOOP-13702 is committed to address the comments.

I think this JIRA can be coordinated with HDFS-10804.
The latest patch of this JIRA already does some similar things, and I think the 
changes can address the concerns raised in HDFS-10804. Thanks.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, 
> HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, 
> HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, 
> HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-2.patch, HDFS-9668-3.patch, 
> HDFS-9668-4.patch, HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, 
> HDFS-9668-8.patch, HDFS-9668-9.patch, execution_time.png
>
>
> During an HBase test on tiered HDFS storage (the WAL stored on SSD/RAMDISK 
> and all other files on HDD), we observed many threads BLOCKED for a long time 
> on FsDatasetImpl in the DataNode. The following is part of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution time of some operations in FsDatasetImpl during the 
> test. The following is the result.
> !execution_time.png!
> Under heavy load, the finalizeBlock, addBlock and createRbw operations on HDD 
> take a really long time.
> This means one slow finalizeBlock, addBlock or createRbw operation on slow 
> storage can block all other such operations on the same DataNode, which hurts 
> especially in HBase when many WAL/flusher/compactor threads are configured.
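>
> For illustration only (a self-contained toy, not the FsDatasetImpl patch): the 
> contention disappears once each volume guards its own I/O instead of every 
> operation serializing on one dataset-wide monitor.
> {code}
> import java.util.concurrent.locks.ReentrantLock;
>
> public class PerVolumeLockDemo {
>   // One lock per simulated volume instead of a single coarse monitor.
>   static final ReentrantLock[] VOLUME_LOCKS =
>       { new ReentrantLock(), new ReentrantLock() };
>
>   static void createRbw(int volume, long ioMillis) throws InterruptedException {
>     VOLUME_LOCKS[volume].lock();
>     try {
>       Thread.sleep(ioMillis);          // simulated disk I/O
>       System.out.println("createRbw done on volume " + volume);
>     } finally {
>       VOLUME_LOCKS[volume].unlock();
>     }
>   }
>
>   public static void main(String[] args) throws Exception {
>     Thread slowHdd = new Thread(() -> {
>       try { createRbw(0, 2000); } catch (InterruptedException ignored) { }
>     });
>     Thread fastSsd = new Thread(() -> {
>       try { createRbw(1, 10); } catch (InterruptedException ignored) { }
>     });
>     slowHdd.start();
>     fastSsd.start();                   // completes quickly: no shared monitor held
>     slowHdd.join();
>     fastSsd.join();
>   }
> }
> {code}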
> 

[jira] [Updated] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones

2016-10-10 Thread Hanisha Koneru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDFS-10906:
--
Attachment: HDFS-10906.000.patch

> Add unit tests for Trash with HDFS encryption zones
> ---
>
> Key: HDFS-10906
> URL: https://issues.apache.org/jira/browse/HDFS-10906
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.8.0
>Reporter: Xiaoyu Yao
>Assignee: Hanisha Koneru
> Attachments: HDFS-10906.000.patch
>
>
> The goal is to improve unit test coverage for HDFS trash with encryption zone 
> especially under Kerberos environment. The current unit test 
> TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage on 
> non-Kerberos case.






[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10986:
-
Attachment: HDFS-10986-branch-2.8.002.patch

> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, 
> HDFS-10986.001.patch, HDFS-10986.002.patch
>
>
> There are some subcommands in {{DFSAdmin}} that swallow IOException and give a 
> very limited error message, if any, to stderr.
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
> Datanode unreachable.
> $ hdfs dfsadmin -getDatanodeInfo localhost:9866
> Datanode unreachable.
> $ hdfs dfsadmin -evictWriters 127.0.0.1:9866
> $ echo $?
> -1
> {code}
> The user is not able to get the exception stack even when the LOG level is 
> DEBUG. This is not very user friendly. Fortunately, if the port number is not 
> accessible (say ), users can infer the detailed error message from the IPC logs:
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
> 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> .
> 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 9 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
> localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed 
> retries number: 10
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> ...
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
> Datanode unreachable.
> {code}
> We should fix this by providing a detailed error message. Actually, 
> {{DFSAdmin#run}} already handles exceptions carefully, including:
> # set the exit return value to -1
> # print the error message
> # log the exception stack trace (at DEBUG level)
> All we need to do is not swallow exceptions without good reason (see the sketch below).
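>
> A sketch of the kind of change meant here (illustrative only, not the actual 
> patch; the method and helper names are recalled from DFSAdmin and may differ):
> {code}
> private int getDatanodeInfo(String[] argv, int i) throws IOException {
>   ClientDatanodeProtocol dnProxy = getDataNodeProxy(argv[i]);
>   try {
>     DatanodeLocalInfo dnInfo = dnProxy.getDatanodeInfo();
>     System.out.println(dnInfo.getDatanodeLocalReport());
>   } catch (IOException ioe) {
>     System.err.println("Datanode unreachable. " + ioe.getMessage());
>     throw ioe;  // re-throw instead of silently returning -1, so DFSAdmin#run()
>                 // can set the exit code and log the stack trace at DEBUG level
>   }
>   return 0;
> }
> {code}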






[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10986:
-
Attachment: HDFS-10986.002.patch

> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, 
> HDFS-10986.001.patch, HDFS-10986.002.patch
>
>
> There are some subcommands in {{DFSAdmin}} that swallow IOException and give a 
> very limited error message, if any, to stderr.
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
> Datanode unreachable.
> $ hdfs dfsadmin -getDatanodeInfo localhost:9866
> Datanode unreachable.
> $ hdfs dfsadmin -evictWriters 127.0.0.1:9866
> $ echo $?
> -1
> {code}
> The user is not able to get the exception stack even when the LOG level is 
> DEBUG. This is not very user friendly. Fortunately, if the port number is not 
> accessible (say ), users can infer the detailed error message from the IPC logs:
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
> 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> .
> 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 9 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
> localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed 
> retries number: 10
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> ...
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
> Datanode unreachable.
> {code}
> We should fix this by providing a detailed error message. Actually, 
> {{DFSAdmin#run}} already handles exceptions carefully, including:
> # set the exit return value to -1
> # print the error message
> # log the exception stack trace (at DEBUG level)
> All we need to do is not swallow exceptions without good reason.






[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10986:
-
Target Version/s: 2.8.0  (was: 3.0.0-alpha2)

> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10986.000.patch, HDFS-10986.001.patch
>
>
> There are some subcommands in {{DFSAdmin}} that swallow IOException and give a 
> very limited error message, if any, to stderr.
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
> Datanode unreachable.
> $ hdfs dfsadmin -getDatanodeInfo localhost:9866
> Datanode unreachable.
> $ hdfs dfsadmin -evictWriters 127.0.0.1:9866
> $ echo $?
> -1
> {code}
> The user is not able to get the exception stack even when the LOG level is 
> DEBUG. This is not very user friendly. Fortunately, if the port number is not 
> accessible (say ), users can infer the detailed error message from the IPC logs:
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
> 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> .
> 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 9 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
> localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed 
> retries number: 10
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> ...
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
> Datanode unreachable.
> {code}
> We should fix this by providing a detailed error message. Actually, 
> {{DFSAdmin#run}} already handles exceptions carefully, including:
> # set the exit return value to -1
> # print the error message
> # log the exception stack trace (at DEBUG level)
> All we need to do is not swallow exceptions without good reason.






[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10972:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to {{branch-2.8}}. Thanks for your contribution, [~xiaobingo].

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, 
> HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.






[jira] [Commented] (HDFS-10933) Refactor TestFsck

2016-10-10 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564047#comment-15564047
 ] 

Takanobu Asanuma commented on HDFS-10933:
-

Thank you for reviewing and committing, [~jojochuang]! I will create a branch-2 
patch soon. 

> Refactor TestFsck
> -
>
> Key: HDFS-10933
> URL: https://issues.apache.org/jira/browse/HDFS-10933
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Minor
> Attachments: HDFS-10933.1.patch, HDFS-10933.2.patch, 
> HDFS-10933.3.patch, HDFS-10933.WIP.1.patch
>
>
> {{TestFsck}} should be refactored:
> - use @Before and @After annotations (see the sketch below)
> - improve logging
> - fix checkstyle warnings
> etc.
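>
> A sketch of the setup/teardown part of the refactoring (illustrative only):
> {code}
> private Configuration conf;
> private MiniDFSCluster cluster;
>
> @Before
> public void setUp() throws Exception {
>   conf = new Configuration();
>   cluster = new MiniDFSCluster.Builder(conf).build();
>   cluster.waitActive();
> }
>
> @After
> public void tearDown() {
>   if (cluster != null) {
>     cluster.shutdown();
>     cluster = null;
>   }
> }
> {code}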






[jira] [Comment Edited] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563889#comment-15563889
 ] 

Mingliang Liu edited comment on HDFS-10965 at 10/11/16 12:43 AM:
-

{code}
384   /* init reused vars */
385   List outs = null;
386   int ret;
387 
388   /**
389* test normal run
390*/
{code}
No reuse found. Make them final.

{code}
400   assertEquals(
401   "three lines per Datanode: the 1st line is rack info, 2nd 
node info,"
402   + " 3rd empty line.",
403   12, outs.size());
{code}
"There should be three lines per Datanode: the 1st line "
12 -> 3 * numDn

{code}
376 /* init cluster using topology */
377 try (MiniDFSCluster miniCluster = new 
MiniDFSCluster.Builder(dfsConf)
378 .numDataNodes(numDn).racks(racks).build()) {
{code}
You created a new MiniDFSCluster in the test using the default cluster 
directory, which conflicts with the pre-setup class variable {{cluster}} (e.g. it 
is not able to find the edits dir, etc.). The reason is that MiniDFSCluster 
formats the directory every time we build a new one. Please have a look at 
[HDFS-10986] for more information on using the pre-set {{cluster}}.
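
Putting the two suggestions together, the assertion could read roughly like this 
(sketch only, with {{numDn}} being the number of DataNodes the test starts):
{code}
assertEquals(
    "There should be three lines per Datanode: the 1st line is rack info, "
        + "2nd node info, 3rd empty line.",
    3 * numDn, outs.size());
{code}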



was (Author: liuml07):
{code}
384   /* init reused vars */
385   List outs = null;
386   int ret;
387 
388   /**
389* test normal run
390*/
{code}
No reuse found. Make them final.

{code}
400   assertEquals(
401   "three lines per Datanode: the 1st line is rack info, 2nd 
node info,"
402   + " 3rd empty line.",
403   12, outs.size());
{code}
"There should be three lines per Datanode: the 1st line "
12 -> 3 * numDn

Otherwise +1 pending on Jenkins.


> Add unit test for HDFS command 'dfsadmin -printTopology'
> 
>
> Key: HDFS-10965
> URL: https://issues.apache.org/jira/browse/HDFS-10965
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, 
> HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch
>
>
> DFSAdmin#printTopology should also be tested. This proposes adding it in 
> TestDFSAdmin.






[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10986:
-
Attachment: HDFS-10986.001.patch

Thanks for your review, [~brahmareddy]. The v1 patch addresses both of your 
comments (which are very valid).

[~xiaobingo], in [HDFS-10972] you created a new MiniDFSCluster in the test using 
the default cluster directory, which conflicts with the pre-setup class 
variable {{cluster}} (e.g. it is not able to find the edits dir, etc.). The reason 
is that MiniDFSCluster formats the directory every time we build a new one. Your 
test can pass regardless of this. This v1 patch, while adding more failing cases, 
also addresses that problem. Please confirm and review the test here. Thanks.

> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10986.000.patch, HDFS-10986.001.patch
>
>
> There are some subcommands in {{DFSAdmin}} that swallow IOException and give a 
> very limited error message, if any, to stderr.
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
> Datanode unreachable.
> $ hdfs dfsadmin -getDatanodeInfo localhost:9866
> Datanode unreachable.
> $ hdfs dfsadmin -evictWriters 127.0.0.1:9866
> $ echo $?
> -1
> {code}
> The user is not able to get the exception stack even when the LOG level is 
> DEBUG. This is not very user friendly. Fortunately, if the port number is not 
> accessible (say ), users can infer the detailed error message from the IPC logs:
> {code}
> $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
> 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> .
> 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
> localhost/127.0.0.1:. Already tried 9 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
> localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed 
> retries number: 10
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> ...
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
> Datanode unreachable.
> {code}
> We should fix this by providing a detailed error message. Actually, 
> {{DFSAdmin#run}} already handles exceptions carefully, including:
> # set the exit return value to -1
> # print the error message
> # log the exception stack trace (at DEBUG level)
> All we need to do is not swallow exceptions without good reason.






[jira] [Commented] (HDFS-10903) Replace config key literal strings with config key names II: hadoop hdfs

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564009#comment-15564009
 ] 

Mingliang Liu commented on HDFS-10903:
--

{{ipc.client.connect.max.retries}} is used as a string literal in some tests 
(e.g. {{TestFileAppend4}}); please also address those in this JIRA. Thanks.

> Replace config key literal strings with config key names II: hadoop hdfs
> 
>
> Key: HDFS-10903
> URL: https://issues.apache.org/jira/browse/HDFS-10903
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Mingliang Liu
>Assignee: Chen Liang
>Priority: Minor
> Attachments: HADOOP-13644.001.patch, HDFS-10903.002.patch
>
>
> In *Hadoop HDFS*, there are some places that use config key literal strings 
> instead of config key names, e.g.
> {code:title=IOUtils.java}
> copyBytes(in, out, conf.getInt("io.file.buffer.size", 4096), true);
> {code}
> We should replace places like this.
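>
> For the example above, the replacement would look roughly like this (a sketch, 
> assuming the constants in {{CommonConfigurationKeysPublic}} are the right home 
> for this key):
> {code:title=IOUtils.java}
> copyBytes(in, out, conf.getInt(
>     CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
>     CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT), true);
> {code}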






[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563986#comment-15563986
 ] 

Hadoop QA commented on HDFS-10965:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 1s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 57m 
43s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10965 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832554/HDFS-10965.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a7b78250c754 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17092/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17092/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add unit test for HDFS command 'dfsadmin -printTopology'
> 
>
> Key: HDFS-10965
> URL: https://issues.apache.org/jira/browse/HDFS-10965
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, 
> HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch
>
>
> DFSAdmin#printTopology should also be tested. This proposes adding it in 
> TestDFSAdmin.




[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563969#comment-15563969
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 43s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new 
+ 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832548/HDFS-10967.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 7bb128d54181 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17091/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
 |
| unit | 

[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563946#comment-15563946
 ] 

Hadoop QA commented on HDFS-10972:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
50s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}172m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
| JDK v1.7.0_111 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:5af2af1 |
| JIRA Issue | HDFS-10972 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832539/HDFS-10972-branch-2.8.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bca529a9fc97 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 

[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10986:
-
Description: 
There are some subcommands in {{DFSAdmin}} that swallow IOException and print a 
very limited error message, if any, to stderr.

{code}
$ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
Datanode unreachable.
$ hdfs dfsadmin -getDatanodeInfo localhost:9866
Datanode unreachable.
$ hdfs dfsadmin -evictWriters 127.0.0.1:9866
$ echo $?
-1
{code}

The user is not able to get the exception stack even when the LOG level is DEBUG. 
This is not very user friendly. Fortunately, if the port number is not accessible 
(say ), users can infer the detailed error message from the IPC logs:
{code}
$ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:. Already tried 0 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
.
2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:. Already tried 9 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed 
retries number: 10
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
Datanode unreachable.
{code}

We should fix this by providing a detailed error message. Actually, 
{{DFSAdmin#run}} already handles exceptions carefully, including:
# setting the exit return value to -1
# printing the error message
# logging the exception stack trace (at DEBUG level)

All we need to do is stop swallowing exceptions without a good reason.
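
For illustration only, a minimal sketch of the idea (the handler shape and the helper 
names here are assumptions, not the actual DFSAdmin code): let the subcommand declare 
the IOException instead of catching it, so {{DFSAdmin#run}} can report the real cause.

{code}
// Before (sketch): the root cause is swallowed and replaced by a fixed message.
private int getDatanodeInfo(String[] argv, int i) {
  try {
    DatanodeLocalInfo dnInfo = getDataNodeProxy(argv[i]).getDatanodeInfo();
    System.out.println(dnInfo.getDatanodeLocalReport());
  } catch (IOException ioe) {
    System.err.println("Datanode unreachable.");
    return -1;
  }
  return 0;
}

// After (sketch): propagate the exception; DFSAdmin#run prints the message,
// sets the -1 exit value, and logs the stack trace at DEBUG level.
private int getDatanodeInfo(String[] argv, int i) throws IOException {
  DatanodeLocalInfo dnInfo = getDataNodeProxy(argv[i]).getDatanodeInfo();
  System.out.println(dnInfo.getDatanodeLocalReport());
  return 0;
}
{code}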

  was:
There are some subcommands in {{DFSAdmin}} that swallow IOException and give 
very limited error message, if any, to the stderr.

{code}
$ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866
Datanode unreachable.
$ hdfs dfsadmin -getDatanodeInfo localhost:9866
Datanode unreachable.
$ hdfs dfsadmin -evictWriters 127.0.0.1:9866
$ echo $?
-1
{code}

User is not able to get the exception stack even the LOG level is DEBUG. This 
is not very user friendly. Fortunately, if the port number is not accessible 
(say ), users can infer the detailed error message by IPC logs:
{code}
$ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:
2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9690. Already tried 0 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
.
2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9690. Already tried 9 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: 
localhost/127.0.0.1:9690: retries get failed due to exceeded maximum allowed 
retries number: 10
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225)
Datanode unreachable.
{code}

We should fix this by providing detailed error message. Actually, the 
{{DFSAdmin#run}} already handles exception carefully, including:
# set the exit ret value to -1
# print the error message
# log the exception stack trace (in DEBUG level)

All we need to do is to not swallow exceptions without good reason.


> DFSAdmin should log detailed error message if any
> -
>
> Key: HDFS-10986
> URL: https://issues.apache.org/jira/browse/HDFS-10986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>   

[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563889#comment-15563889
 ] 

Mingliang Liu commented on HDFS-10965:
--

{code}
384   /* init reused vars */
385   List outs = null;
386   int ret;
387 
388   /**
389    * test normal run
390    */
{code}
No reuse found. Make them final.

{code}
400   assertEquals(
401   "three lines per Datanode: the 1st line is rack info, 2nd 
node info,"
402   + " 3rd empty line.",
403   12, outs.size());
{code}
"There should be three lines per Datanode: the 1st line "
12 -> 3 * numDn
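
Putting the two suggestions above together, a hedged sketch of the revised assertion 
(assuming {{numDn}} is the number of DataNodes started by the mini cluster in this test):

{code}
// Sketch only; message wording and variable name are suggestions, not the committed code.
assertEquals(
    "There should be three lines per Datanode: the 1st line is rack info,"
        + " 2nd node info, 3rd empty line.",
    3 * numDn, outs.size());
{code}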

Otherwise +1 pending on Jenkins.


> Add unit test for HDFS command 'dfsadmin -printTopology'
> 
>
> Key: HDFS-10965
> URL: https://issues.apache.org/jira/browse/HDFS-10965
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, 
> HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch
>
>
> DFSAdmin#printTopology should also be tested. This proposes adding it in 
> TestDFSAdmin.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563882#comment-15563882
 ] 

Konstantin Shvachko commented on HDFS-10967:


This could be a good optimization of the placement policy for replicas other than 
the first, when some nodes are close to full.
It should only be turned on in a heterogeneous environment, though. With 
homogeneous nodes, where node usage is balanced, it would just add performance 
overhead.
A few suggestions on the patch:
# Set the default for the remaining-capacity threshold that triggers the new behavior to {{0.00}}. Homogeneous clusters should not require reconfiguration.
# The config variable is more like a "threshold" than a "factor" as in {{considerLoad}}, so maybe call it {{dfs.namenode.replication.considerCapacity.threshold}}.
# I would suggest using only one configuration variable; the boolean {{considerCapacity}} is essentially redundant.
# It would be good to have JavaDoc for the new config variable and for the method where the feature is triggered.
# Did you try adding an {{isNearFull()}} call to {{isGoodDatanode()}}? Then you would not need to retry 3 times (see the sketch after this list).
# JavaDoc for the unit test would also be useful.
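
A hedged sketch of suggestions 2 and 5 combined (all names and values here are 
illustrative assumptions, not the actual {{BlockPlacementPolicyDefault}} code): a single 
threshold key with a {{0.00}} default, plus the near-full check folded into the datanode 
goodness test.

{code}
// Illustrative only; not the patch itself.
class CapacityAwarePlacementSketch {
  // Hypothetical key and default, following suggestions 1 and 2.
  static final String CONSIDER_CAPACITY_THRESHOLD_KEY =
      "dfs.namenode.replication.considerCapacity.threshold";
  static final double CONSIDER_CAPACITY_THRESHOLD_DEFAULT = 0.00;  // feature off

  private final double threshold;

  CapacityAwarePlacementSketch(double threshold) {
    this.threshold = threshold;
  }

  /** A node is "near full" when its remaining ratio drops below the threshold. */
  boolean isNearFull(long remaining, long capacity) {
    return threshold > 0 && capacity > 0
        && ((double) remaining / capacity) < threshold;
  }

  /** Suggestion 5: reject near-full nodes inside the goodness test itself. */
  boolean isGoodDatanode(long remaining, long capacity, boolean otherChecksPass) {
    return otherChecksPass && !isNearFull(remaining, capacity);
  }
}
{code}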

A separate discussion point: should we add a separate admin command for every 
configuration knob we introduce? This doesn't feel right. Maybe we should have a 
command, say {{refreshConfiguration()}}, which re-reads the config, updates 
variables, and logs what it updated. The main problem with an admin command is 
that it does not change the configuration, so when you restart the service or fail 
it over you lose the set value. So maybe anything that is controlled by 
configuration should be updated through configuration and the 
{{refreshConfiguration()}} call. We could have merged this with 
{{setBalancerBandwidth()}} at some point.
Alternatively, we can rely on a failover to pick up new configuration values. 
Then we don't need new admin commands at all.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10977) Balancer should query NameNode with a timeout

2016-10-10 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563883#comment-15563883
 ] 

Zhe Zhang commented on HDFS-10977:
--

Thanks [~senthilec566] for reporting a similar issue. Yes, it's possible that 
decomm nodes were causing the infinite delay, in which case {{-include}} is a 
good workaround. I'll try that in our cluster.

But in general, I think we should still add the timeout logic. For example, the RPC 
request could encounter issues at the network layer.
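
A minimal sketch of the timeout idea (illustrative only; this is not the actual 
{{Dispatcher}} code, and the query is just a hypothetical {{Callable}}): bound the 
blocking NameNode call with a {{Future}} so the Balancer can fail or retry instead of 
hanging forever.

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimedNameNodeQuery {
  /** Runs the query on a helper thread and waits at most timeoutSeconds for it. */
  static <T> T callWithTimeout(Callable<T> query, long timeoutSeconds) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<T> future = executor.submit(query);
      // Throws TimeoutException if the NameNode does not answer in time.
      return future.get(timeoutSeconds, TimeUnit.SECONDS);
    } finally {
      executor.shutdownNow();
    }
  }
}
{code}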

> Balancer should query NameNode with a timeout
> -
>
> Key: HDFS-10977
> URL: https://issues.apache.org/jira/browse/HDFS-10977
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: Gmail - HDFS Balancer Stuck after 10 Minz.pdf, 
> HDFS-10977-reproduce.patch
>
>
> We found a case where {{Dispatcher}} was stuck at {{getBlockList}} *forever* 
> (well, several hours when we found it).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Description: 
{{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed-time 
sleep before its assertions. This may occasionally fail even though 10 seconds is 
generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
retry the assertions. Meanwhile, it usually does not take 10 seconds to reach 
the condition (<1s on my laptop), so removing the 10s sleep will also make the UT 
run faster.

This is also true for {{TestZKFailoverController#testGracefulFailover}}.
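
A minimal sketch of the replacement (assuming the 
{{GenericTestUtils.waitFor(Supplier<Boolean>, checkEveryMillis, waitForMillis)}} overload 
from {{org.apache.hadoop.test}}; {{conditionHolds()}} is a hypothetical stand-in for the 
real assertion condition):

{code}
// Instead of Thread.sleep(10000) followed by the assertions:
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    return conditionHolds();   // re-checked periodically instead of a blind sleep
  }
}, 100, 10000);                // poll every 100 ms, give up after 10 s
{code}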

  was:
{{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed time 
sleep before assertions. This may fail sometimes though 10 seconds are 
generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
retry the assertions.

This is also true to {{TestZKFailoverController#testGracefulFailover}}.


> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed 
> time sleep before assertions. This may fail sometimes though 10 seconds are 
> generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
> retry the assertions. Meanwhile, it usually does not need 10 seconds to reach 
> the condition (<1s in my laptop). Removing the 10s sleep will also make the 
> UT run faster.
> This is also true to {{TestZKFailoverController#testGracefulFailover}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563835#comment-15563835
 ] 

Zhe Zhang edited comment on HDFS-10967 at 10/10/16 11:07 PM:
-

The above Jenkins report was for the v2 patch. We'll see one for v3 soon. I verified 
the reported test failures and could only reproduce {{TestHdfsConfigFields}}. Once 
we agree on the overall structure I'll do the due diligence:
# Add the item to {{hdfs-default.xml}}
# Document the new dfsAdmin command
# Clear up the checkStyle warnings


was (Author: zhz):
I verified reported test failures and could only reproduce 
{{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the 
due diligence:
# Add the item to {{hdfs-default.xml}}
# Document the new dfsAdmin command
# Clear up the checkStyle warnings

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563835#comment-15563835
 ] 

Zhe Zhang commented on HDFS-10967:
--

I verified the reported test failures and could only reproduce 
{{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the 
due diligence:
# Add the item to {{hdfs-default.xml}}
# Document the new dfsAdmin command
# Clear up the checkStyle warnings

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10629) Federation Router

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563827#comment-15563827
 ] 

Hadoop QA commented on HDFS-10629:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
23s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-10467 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 394 unchanged - 0 fixed = 396 total (was 394) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 54s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Sequence of calls to java.util.concurrent.ConcurrentHashMap may not be 
atomic in 
org.apache.hadoop.hdfs.server.federation.router.ConnectionManager.getConnection(UserGroupInformation,
 String)  At ConnectionManager.java:may not be atomic in 
org.apache.hadoop.hdfs.server.federation.router.ConnectionManager.getConnection(UserGroupInformation,
 String)  At ConnectionManager.java:[line 151] |
|  |  
org.apache.hadoop.hdfs.server.federation.router.Router.initAndStartRouter(Configuration,
 boolean) invokes System.exit(...), which shuts down the entire virtual machine 
 At Router.java:shuts down the entire virtual machine  At Router.java:[line 
123] |
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10629 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832545/HDFS-10629-HDFS-10467-007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 96155d4d43ed 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Commented] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563816#comment-15563816
 ] 

Hadoop QA commented on HDFS-10984:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 185 unchanged - 0 fixed = 188 total (was 185) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10984 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832544/HDFS-10984.v3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 349de580663e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c874fa9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17089/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17089/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17089/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17089/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Expose nntop output as metrics   
> 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563810#comment-15563810
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new 
+ 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832542/HDFS-10967.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux a91a71c163fa 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c874fa9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Commented] (HDFS-10637) Modifications to remove the assumption that FsVolumes are backed by java.io.File.

2016-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563788#comment-15563788
 ] 

Hudson commented on HDFS-10637:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10583 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10583/])
HDFS-10637. Modifications to remove the assumption that FsVolumes are (lei: rev 
96b12662ea76e3ded4ef13944fc8df206cfb4613)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancerWithMockMover.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/VolumeScanner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeHotSwapVolumes.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/VolumeFailureInfo.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DiskBalancer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockScanner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImplBuilder.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StorageLocation.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/LocalReplica.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetAsyncDiskService.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


> Modifications to remove the assumption that FsVolumes are backed by 
> java.io.File.
> -
>
> 

[jira] [Updated] (HDFS-10637) Modifications to remove the assumption that FsVolumes are backed by java.io.File.

2016-10-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-10637:
-
  Resolution: Fixed
Hadoop Flags: Incompatible change
   Fix Version/s: 3.0.0-alpha2
Target Version/s: 3.0.0-alpha1
  Status: Resolved  (was: Patch Available)

+1. The checkstyle warnings were pre-existing ones, caused by moving code around.

Committed to trunk. 

Thanks for the great work, [~virajith].


> Modifications to remove the assumption that FsVolumes are backed by 
> java.io.File.
> -
>
> Key: HDFS-10637
> URL: https://issues.apache.org/jira/browse/HDFS-10637
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10637.001.patch, HDFS-10637.002.patch, 
> HDFS-10637.003.patch, HDFS-10637.004.patch, HDFS-10637.005.patch, 
> HDFS-10637.006.patch, HDFS-10637.007.patch, HDFS-10637.008.patch, 
> HDFS-10637.009.patch, HDFS-10637.010.patch, HDFS-10637.011.patch
>
>
> Modifications to {{FsVolumeSpi}} and {{FsVolumeImpl}} to remove references to 
> {{java.io.File}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'

2016-10-10 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563675#comment-15563675
 ] 

Xiaobing Zhou commented on HDFS-10965:
--

v004 is posted, which is based on some work committed in HDFS-10972.

> Add unit test for HDFS command 'dfsadmin -printTopology'
> 
>
> Key: HDFS-10965
> URL: https://issues.apache.org/jira/browse/HDFS-10965
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, 
> HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch
>
>
> DFSAdmin#printTopology should also be tested. This proposes adding it in 
> TestDFSAdmin.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'

2016-10-10 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10965:
-
Attachment: HDFS-10965.004.patch

> Add unit test for HDFS command 'dfsadmin -printTopology'
> 
>
> Key: HDFS-10965
> URL: https://issues.apache.org/jira/browse/HDFS-10965
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, 
> HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch
>
>
> DFSAdmin#printTopology should also be tested. This proposes adding it in 
> TestDFSAdmin.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563541#comment-15563541
 ] 

Zhe Zhang edited comment on HDFS-10967 at 10/10/16 9:48 PM:


Thanks [~mingma] for the feedback!

Attaching v2 patch to add dfsAdmin command to update the config without NN 
restart.

bq. For balancer or over replicated scenarios, chooseReplicasToDelete uses 
absolute free space, maybe that needs to be change to percentage based?
Agreed. In general I think all capacity-based decisions (Balancer, placement 
policy, etc.) should be consistent. Since the current patch is already big, let's 
address this issue separately?

bq. What if we move this new policy to isGoodDatanode?
That's a good thought. The challenge is that this capacity consideration is not 
a _hard_ restriction. I.e., if a DN's capacity usage is over the factor, it is 
not an ideal candidate from a balancing perspective, but if there are no other 
valid candidates it should still be considered. I think I found a way to apply 
the logic to the remote writer and {{BlockPlacementPolicyRackFaultTolerant}}, but 
it requires some additional refactoring. Will attach the v3 patch soon.
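
A hedged sketch of that "soft restriction" idea (illustrative names only, not the actual 
placement-policy code): prefer targets that are not near full, but fall back to the full 
candidate set when nothing else qualifies.

{code}
import java.util.ArrayList;
import java.util.List;

class SoftCapacityPreference {
  interface Node { boolean isNearFull(); }

  static <T extends Node> List<T> preferNotNearFull(List<T> candidates) {
    List<T> preferred = new ArrayList<>();
    for (T n : candidates) {
      if (!n.isNearFull()) {
        preferred.add(n);
      }
    }
    // Soft constraint: if every candidate is near full, still allow all of them.
    return preferred.isEmpty() ? candidates : preferred;
  }
}
{code}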


was (Author: zhz):
Thanks [~mingma] for the feedback!

Attaching v2 patch to add dfsAdmin command to update the config without NN 
restart.

bq. For balancer or over replicated scenarios, chooseReplicasToDelete uses 
absolute free space, maybe that needs to be change to percentage based?
Agreed. In general I think all capacity-based decisions (Balancer, placement 
policy etc) should be consistent.

bq. What if we move this new policy to isGoodDatanode?
That's a good thought. The challenge is that this capacity consideration is not 
a _hard_ restriction. I.e. if a DN's capacity usage if over the factor, it is 
not an ideal candidate from balancing perspective, but if there's no other 
valid candidates, it should still be considered. I think I found a way to apply 
the logic to remote writer and {{BlockPlacementPolicyRackFaultTolerant}}. But 
it requires some additional refactors. Will attach v3 patch soon.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10967:
-
Attachment: HDFS-10967.03.patch

Attaching the v3 patch to include the remote writer and 
{{BlockPlacementPolicyRackFaultTolerant}} scenarios as [~mingma] suggested 
above.

While writing the v3 patch I realized an issue in the v2 patch: if a DN is 
considered in a random try, it will be put into the excluded nodes list. The v3 
patch addresses that issue as well.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API

2016-10-10 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563615#comment-15563615
 ] 

James Clampffer commented on HDFS-10991:


I normally stick to the HDFS-8707 branch so I'm not sure if my +1 counts here, 
but if it does this seems like a very straightforward fix, +1.

> libhdfs :  Client compilation is failing for hdfsTruncateFile API
> -
>
> Key: HDFS-10991
> URL: https://issues.apache.org/jira/browse/HDFS-10991
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-10991.patch
>
>
> {noformat}
> /tmp/ccJNUj6m.o: In function `main':
> test.c:(.text+0x812): undefined reference to `hdfsTruncateFile'
> collect2: ld returned 1 exit status
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10629) Federation Router

2016-10-10 Thread Jason Kace (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Kace updated HDFS-10629:
--
Attachment: HDFS-10629-HDFS-10467-007.patch

Updating the patch to include:

1) Releasing router->NN client connections after use, which prevents adding 
unnecessary connections to the pool (see the sketch below).

2) Separating ConnectionManager and ConnectionPool into separate Java files.

3) Fixing most findbugs issues and some style items.
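
A minimal sketch of the release pattern from item 1 (the connection wrapper type and 
its {{release()}} method are assumptions here, not code copied from the patch):

{code}
// Borrow a pooled connection for one invocation, then hand it back in finally.
ConnectionContext connection = connectionManager.getConnection(ugi, nnAddress);
try {
  // ... invoke the ClientProtocol method through this connection ...
} finally {
  connection.release();  // return it to the pool instead of leaking it
}
{code}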

> Federation Router
> -
>
> Key: HDFS-10629
> URL: https://issues.apache.org/jira/browse/HDFS-10629
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Inigo Goiri
>Assignee: Jason Kace
> Attachments: HDFS-10629-HDFS-10467-002.patch, 
> HDFS-10629-HDFS-10467-003.patch, HDFS-10629-HDFS-10467-004.patch, 
> HDFS-10629-HDFS-10467-005.patch, HDFS-10629-HDFS-10467-006.patch, 
> HDFS-10629-HDFS-10467-007.patch, HDFS-10629.000.patch, HDFS-10629.001.patch
>
>
> Component that routes calls from the clients to the right Namespace. It 
> implements {{ClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Patch Available  (was: Open)

Re-submitting patch for review + tests.

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch, HDFS-10984.v3.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: HDFS-10984.v3.patch

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch, HDFS-10984.v3.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: (was: HDFS-10984.v3.patch)

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch, HDFS-10984.v3.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: HDFS-10984.v3.patch

Addressed review comments from [~xyao].

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch, HDFS-10984.v3.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10967:
-
Attachment: HDFS-10967.02.patch

Thanks [~mingma] for the feedback!

Attaching a v2 patch that adds a dfsAdmin command to update the config without an 
NN restart.

bq. For balancer or over-replicated scenarios, chooseReplicasToDelete uses 
absolute free space; maybe that needs to be changed to percentage-based?
Agreed. In general I think all capacity-based decisions (Balancer, placement 
policy, etc.) should be consistent.

bq. What if we move this new policy to isGoodDatanode?
That's a good thought. The challenge is that this capacity consideration is not 
a _hard_ restriction. That is, if a DN's capacity usage is over the factor, it is 
not an ideal candidate from a balancing perspective, but if there are no other 
valid candidates, it should still be considered. I think I found a way to apply 
the logic to the remote-writer case and {{BlockPlacementPolicyRackFaultTolerant}}, 
but it requires some additional refactoring. Will attach a v3 patch soon.
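
To make the "not a hard restriction" idea concrete, here is a minimal, 
self-contained sketch of the kind of soft capacity check being discussed: a 
DataNode whose used-space ratio exceeds a configurable factor is skipped on the 
first pass, but is still accepted when no other candidate qualifies. The class, 
field, and method names below are hypothetical illustrations, not the actual 
patch.

{code:java}
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a "soft" near-full check for block placement.
 * A node over the usage threshold is de-prioritized, not hard-excluded.
 */
public final class NearFullPlacementSketch {

  /** Minimal view of a candidate DataNode's storage usage. */
  public static final class Candidate {
    final String name;
    final long dfsUsed;
    final long capacity;

    Candidate(String name, long dfsUsed, long capacity) {
      this.name = name;
      this.dfsUsed = dfsUsed;
      this.capacity = capacity;
    }

    double usageRatio() {
      return capacity == 0 ? 1.0 : (double) dfsUsed / capacity;
    }
  }

  /**
   * Prefer nodes under the threshold; fall back to all candidates if none
   * qualify, since the capacity consideration is not a hard restriction.
   */
  public static List<Candidate> choosePreferred(List<Candidate> candidates,
                                                double usageThreshold) {
    List<Candidate> preferred = new ArrayList<>();
    for (Candidate c : candidates) {
      if (c.usageRatio() <= usageThreshold) {
        preferred.add(c);
      }
    }
    return preferred.isEmpty() ? candidates : preferred;
  }
}
{code}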

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we could lower the chance of those near-full DataNodes becoming 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563538#comment-15563538
 ] 

Mingliang Liu commented on HDFS-10972:
--

+1, pending Jenkins.

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, 
> HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Open  (was: Patch Available)

Cancelling to incorporate review comments.

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563493#comment-15563493
 ] 

Hudson commented on HDFS-10985:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10582 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10582/])
HDFS-10985. o.a.h.ha.TestZKFailoverController should not use fixed time 
(liuml07: rev c874fa914dfbf07d1731f5e87398607366675879)
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestZKFailoverController.java


> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed 
> time sleep before its assertions. This may fail occasionally, even though 10 
> seconds is generally long enough. I think we can use 
> {{GenericTestUtils.waitFor()}} to retry the assertions.
> The same applies to {{TestZKFailoverController#testGracefulFailover}}.
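
For reference, a minimal sketch of the retry pattern suggested above, replacing a 
fixed sleep with {{GenericTestUtils.waitFor()}} (shown with the Guava 
{{Supplier}} signature that method took at the time); the polled condition 
({{isActive()}}) is a hypothetical placeholder for whatever the real assertions 
check.

{code:java}
import java.util.concurrent.TimeoutException;
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;

public class GracefulFailoverWaitSketch {

  /** Hypothetical stand-in for the state the test asserts on. */
  interface FailoverTarget {
    boolean isActive();
  }

  /**
   * Instead of sleeping a fixed 10 seconds and then asserting, poll the
   * condition every 100 ms and fail only after a 10-second timeout.
   */
  static void waitUntilActive(final FailoverTarget target)
      throws TimeoutException, InterruptedException {
    GenericTestUtils.waitFor(new Supplier<Boolean>() {
      @Override
      public Boolean get() {
        return target.isActive();
      }
    }, 100, 10000);
  }
}
{code}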



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563487#comment-15563487
 ] 

Xiaobing Zhou commented on HDFS-10972:
--

The branch-2.8 patch is posted, [~liuml07]. Thanks.

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, 
> HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10972:
-
Attachment: HDFS-10972-branch-2.8.003.patch

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, 
> HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563120#comment-15563120
 ] 

Ming Ma edited comment on HDFS-10967 at 10/10/16 8:56 PM:
--

Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous 
nodes, and the rack-based assumption makes sense to me.

* For balancer or over-replicated scenarios, {{chooseReplicasToDelete}} uses 
absolute free space; maybe that needs to be changed to percentage-based?
* What if we move this new policy to {{isGoodDatanode}}? It has several 
benefits:
** BlockPlacementPolicyRackFaultTolerant can use it.
** It covers the case where the writer is outside of the cluster, where the call 
path is chooseLocalRack -> chooseRandom.



was (Author: mingma):
Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous 
nodes, and the rack-based assumption makes sense to me.

* For balancer or over-replicated scenarios, {{chooseReplicasToDelete}} uses 
absolute free space; maybe that needs to be changed to percentage-based?
* What if we move this new policy to {{isGoodDatanode}}? It has several 
benefits:
** BlockPlacementPolicyRackFaultTolerant can use it.
** It covers the case where the writer is outside of the cluster, where the call 
path is chooseLocalRack -> chooseRandom.
* Typo below: you meant this.considerCapacity.
{noformat}
this.considerLoad = conf.getBoolean(...);
{noformat}



> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we could lower the chance of those near-full DataNodes becoming 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}} through {{branch-2.8}}. Thanks [~ste...@apache.org] for 
the review.

> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed 
> time sleep before its assertions. This may fail occasionally, even though 10 
> seconds is generally long enough. I think we can use 
> {{GenericTestUtils.waitFor()}} to retry the assertions.
> The same applies to {{TestZKFailoverController#testGracefulFailover}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563430#comment-15563430
 ] 

Hudson commented on HDFS-10988:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10581 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10581/])
HDFS-10988. Refactor TestBalancerBandwidth. Contributed by Brahma Reddy 
(liuml07: rev b963818621c200160bb37624f177bdcb059de4eb)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBalancerBandwidth.java


> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following (see the sketch below):
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources
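
A minimal sketch of item 3, assuming {{MiniDFSCluster}} implements 
{{AutoCloseable}} on the branches targeted here; the cluster size and the body of 
the test are placeholders, not the actual refactored test.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class TryWithResourcesSketch {

  /**
   * The cluster is shut down automatically when the block exits,
   * even if an assertion or exception fires inside it.
   */
  public void runAgainstMiniCluster() throws Exception {
    Configuration conf = new HdfsConfiguration();
    try (MiniDFSCluster cluster =
             new MiniDFSCluster.Builder(conf).numDataNodes(2).build()) {
      cluster.waitActive();
      // ... exercise the balancer bandwidth command and assert on the result.
    }
  }
}
{code}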



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10988:
-
Component/s: balancer & mover

> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following:
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10988:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}} through {{branch-2.8}}. Thanks for your contribution, 
[~brahmareddy].

> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover, test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following:
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557062#comment-15557062
 ] 

Siddharth Wagle edited comment on HDFS-10984 at 10/10/16 8:25 PM:
--

Sample flattened output of the window metrics emitted to Ambari Metrics System:

{quote}
dfs.NNTopUserOpCounts.windowMs=6.op=listStatus.user=mapred.count
{quote}


was (Author: swagle):
Sample flattened output of the window metrics emitted to Ambari Metrics System:

{quote}
dfs.TopUserOpCounts.windowMs=6.op=listStatus.user=mapred.count
{quote}

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563376#comment-15563376
 ] 

Xiaoyu Yao commented on HDFS-10984:
---

bq. I'm confused about point 4. I have linked the two tickets; did you mean a 
Jira comment, or something for the javadoc as class-level commentary?

I meant the description of the JIRA: "The nntop output is already exposed via JMX 
with HDFS-6982."

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563351#comment-15563351
 ] 

Hadoop QA commented on HDFS-10984:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 8 new + 185 unchanged - 0 fixed = 193 total (was 185) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.tools.TestDFSAdmin |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10984 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832522/HDFS-10984.v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 358780967afa 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3441c74 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17086/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17086/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17086/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17086/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563326#comment-15563326
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 10 new + 452 unchanged - 0 fixed = 462 total (was 452) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks 
|
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
|
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832518/HDFS-10967.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0693f663ee00 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cef61d5 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add configuration for 

[jira] [Commented] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563318#comment-15563318
 ] 

Siddharth Wagle commented on HDFS-10984:


Thanks [~xyao] for the review comments; I will work on incorporating them into 
the patch.

I'm confused about point 4. I have linked the two tickets; did you mean a Jira 
comment, or something for the javadoc as class-level commentary?

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-10-10 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563305#comment-15563305
 ] 

Daryn Sharp commented on HDFS-10301:


Will take a look this afternoon.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, 
> HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, 
> HDFS-10301.012.patch, HDFS-10301.013.patch, HDFS-10301.014.patch, 
> HDFS-10301.015.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. Then 
> it sends the block report again. The NameNode, while processing these two 
> reports at the same time, can interleave the processing of storages from 
> different reports. This screws up the blockReportId field, which makes the 
> NameNode think that some storages are zombies. Replicas from zombie storages 
> are immediately removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563289#comment-15563289
 ] 

Hadoop QA commented on HDFS-10985:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 9s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
10s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
23s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 47 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
22s{color} | {color:green} hadoop-common in the patch passed with JDK 
v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:5af2af1 |
| JIRA Issue | HDFS-10985 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832516/HDFS-10985-branch-2.8.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 897f5c503ae1 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.8 / 583283d |
| Default Java | 1.7.0_111 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 

[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563210#comment-15563210
 ] 

Steve Loughran commented on HDFS-10985:
---

+1

> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed 
> time sleep before its assertions. This may fail occasionally, even though 10 
> seconds is generally long enough. I think we can use 
> {{GenericTestUtils.waitFor()}} to retry the assertions.
> The same applies to {{TestZKFailoverController#testGracefulFailover}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563190#comment-15563190
 ] 

Xiaoyu Yao edited comment on HDFS-10984 at 10/10/16 7:16 PM:
-

Thanks [~swagle] for reporting the issue and posting the patch. The latest 
patch (v2), with the JMXGet issue fixed, looks good to me overall. 
Here are a few minor issues:

1. In TopMetrics#getMetrics(), should we check whether top metrics is enabled, 
like we do in FSNamesystem#getTopUserOpCounts? Or have a unit test verify that 
the metrics source works as expected when top metrics is not enabled.

2. I notice that the patch adds a metric record per window, so there would be 
multiple records per getMetrics() call. Can we elaborate on this in the 
comments?

3. Checkstyle issue from Jenkins.

4. You may also add context info to this ticket: the original HDFS-6982 had a 
Metrics2 source implemented, but it was removed as part of HDFS-7426, "Change 
nntop JMX format to be a JSON blob".


was (Author: xyao):
Thanks [~swagle] for reporting the issue and posting the patch. The latest 
patch (v2), with the JMXGet issue fixed, looks good to me overall. 
Here are a few minor issues:

1. In TopMetrics#getMetrics(), should we check whether top metrics is enabled, 
like we do in FSNamesystem#getTopUserOpCounts? Or have a unit test verify that 
the metrics source works as expected when top metrics is not enabled.

2. I notice that the patch adds a metric record per window, so there would be 
multiple records per getMetrics() call. Can we elaborate on this in the 
comments?

3. Checkstyle issue from Jenkins.

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563190#comment-15563190
 ] 

Xiaoyu Yao commented on HDFS-10984:
---

Thanks [~swagle] for reporting the issue and posting the patch. The latest 
patch (v2), with the JMXGet issue fixed, looks good to me overall. 
Here are a few minor issues:

1. In TopMetrics#getMetrics(), should we check whether top metrics is enabled, 
like we do in FSNamesystem#getTopUserOpCounts? Or have a unit test verify that 
the metrics source works as expected when top metrics is not enabled.

2. I notice that the patch adds a metric record per window, so there would be 
multiple records per getMetrics() call. Can we elaborate on this in the 
comments?

3. Checkstyle issue from Jenkins.
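
To illustrate points 1 and 2 above, here is a minimal, hypothetical sketch of a 
metrics source that short-circuits when top metrics is disabled and otherwise 
emits one record per rollover window, so a single {{getMetrics()}} call produces 
multiple records. The class name, fields, and record naming are placeholders, 
not the code in the patch.

{code:java}
import java.util.Map;
import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsRecordBuilder;
import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.lib.Interns;

/** Hypothetical sketch; not the actual TopMetrics implementation. */
public class TopWindowMetricsSketch implements MetricsSource {

  private final boolean topEnabled;                  // from the nntop config
  private final Map<Integer, Map<String, Long>> opCountsByWindowMs;

  public TopWindowMetricsSketch(boolean topEnabled,
      Map<Integer, Map<String, Long>> opCountsByWindowMs) {
    this.topEnabled = topEnabled;
    this.opCountsByWindowMs = opCountsByWindowMs;
  }

  @Override
  public void getMetrics(MetricsCollector collector, boolean all) {
    if (!topEnabled) {
      return;                        // point 1: no-op when top metrics is off
    }
    // Point 2: one record per rollover window, so one call adds several records.
    for (Map.Entry<Integer, Map<String, Long>> window
        : opCountsByWindowMs.entrySet()) {
      MetricsRecordBuilder rb =
          collector.addRecord("NNTopUserOpCounts.windowMs=" + window.getKey());
      for (Map.Entry<String, Long> op : window.getValue().entrySet()) {
        rb.addCounter(Interns.info(op.getKey(), "nntop op count"), op.getValue());
      }
    }
  }
}
{code}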

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563164#comment-15563164
 ] 

Hudson commented on HDFS-10972:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10578 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10578/])
HDFS-10972. Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'. 
(liuml07: rev 3441c746b5f35c46fca5a0f252c86c8357fe932e)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java


> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, 
> HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10972:
-
Fix Version/s: 3.0.0-alpha2

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, 
> HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563140#comment-15563140
 ] 

Mingliang Liu commented on HDFS-10972:
--

Committed to {{trunk}} and {{branch-2}}. Can you provide a patch for 
{{branch-2.8}}, if it applies? Thanks.

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, 
> HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563120#comment-15563120
 ] 

Ming Ma commented on HDFS-10967:


Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous 
nodes, and the rack-based assumption makes sense to me.

* For balancer or over-replicated scenarios, {{chooseReplicasToDelete}} uses 
absolute free space; maybe that needs to be changed to percentage-based? (See 
the sketch after this comment.)
* What if we move this new policy to {{isGoodDatanode}}? It has several 
benefits:
** BlockPlacementPolicyRackFaultTolerant can use it.
** It covers the case where the writer is outside of the cluster, where the call 
path is chooseLocalRack -> chooseRandom.
* Typo below: you meant this.considerCapacity.
{noformat}
this.considerLoad = conf.getBoolean(...);
{noformat}
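
A minimal sketch of the percentage-based comparison suggested in the first 
bullet, assuming the decision input is just each replica node's free space and 
capacity; the type and method names are illustrative, not the actual 
{{chooseReplicasToDelete}} code.

{code:java}
import java.util.Comparator;
import java.util.List;

/** Illustrative only; assumes a non-empty replica list. */
public final class ReplicaDeletionChoiceSketch {

  /** Minimal view of a replica's DataNode storage numbers. */
  public static final class ReplicaNode {
    final String name;
    final long remaining;   // absolute free space in bytes
    final long capacity;    // total capacity in bytes

    ReplicaNode(String name, long remaining, long capacity) {
      this.name = name;
      this.remaining = remaining;
      this.capacity = capacity;
    }

    double remainingRatio() {
      return capacity == 0 ? 0.0 : (double) remaining / capacity;
    }
  }

  /** Absolute: delete from the node with the fewest bytes of free space. */
  static ReplicaNode pickByAbsoluteFreeSpace(List<ReplicaNode> replicas) {
    return replicas.stream()
        .min(Comparator.comparingLong(r -> r.remaining)).get();
  }

  /**
   * Percentage-based: delete from the node with the lowest fraction of its
   * capacity still free, which treats heterogeneous capacities consistently
   * with a percentage-based placement policy.
   */
  static ReplicaNode pickByFreeSpacePercentage(List<ReplicaNode> replicas) {
    return replicas.stream()
        .min(Comparator.comparingDouble(ReplicaNode::remainingRatio)).get();
  }
}
{code}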



> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we could lower the chance of those near-full DataNodes becoming 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563109#comment-15563109
 ] 

Mingliang Liu commented on HDFS-10988:
--

Test failure is not related.

> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following:
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563108#comment-15563108
 ] 

Mingliang Liu commented on HDFS-10988:
--

+1 Will commit shortly.

> Refactor TestBalancerBandwidth
> --
>
> Key: HDFS-10988
> URL: https://issues.apache.org/jira/browse/HDFS-10988
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10988-002.patch, HDFS-10988.patch
>
>
> This jira will deal with the following:
> 1) Remove the fixed sleep
> 2) Remove the unused dnproxy
> 3) Use try-with-resources



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Patch Available  (was: Open)

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: HDFS-10984.v2.patch

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563096#comment-15563096
 ] 

Hadoop QA commented on HDFS-10985:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
32s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10985 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832511/HDFS-10985.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9e14d3db9866 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cef61d5 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17082/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17082/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> 

[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: (was: HDFS-10984.v2.patch)

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Open  (was: Patch Available)

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563086#comment-15563086
 ] 

Mingliang Liu commented on HDFS-10972:
--

Test failures are not related.

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, 
> HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'

2016-10-10 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563084#comment-15563084
 ] 

Mingliang Liu commented on HDFS-10972:
--

+1 Will commit shortly.

> Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
> --
>
> Key: HDFS-10972
> URL: https://issues.apache.org/jira/browse/HDFS-10972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, shell, test
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, 
> HDFS-10972.002.patch, HDFS-10972.003.patch
>
>
> getDatanodeInfo should be tested in admin CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10967:
-
Attachment: HDFS-10967.01.patch

Updating patch to include a unit test.

Since the new config knob is meant to be used when the cluster is already 
imbalanced, I'll add a dfsAdmin command in the next rev to change the config 
value without restarting NN.
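For illustration, a minimal sketch of the kind of check such a knob could gate; this is not the attached patch, and the configuration key, default, and class name below are assumptions:

{code}
// Illustration only -- not HDFS-10967.01.patch. The key name and semantics
// are assumptions: avoid a candidate DataNode for the 2nd/3rd replica when
// its DFS usage is above a configured percentage.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;

public class NearFullCheckSketch {
  // hypothetical key, not an existing HDFS property
  static final String NEAR_FULL_THRESHOLD_PCT_KEY =
      "dfs.namenode.block-placement.near-full-threshold.pct";

  private final double thresholdPct;

  public NearFullCheckSketch(Configuration conf) {
    // 100.0 effectively disables the check
    thresholdPct = conf.getDouble(NEAR_FULL_THRESHOLD_PCT_KEY, 100.0);
  }

  /** True if the node should be skipped as an extra-replica target. */
  public boolean isNearFull(DatanodeDescriptor node) {
    return node.getDfsUsedPercent() > thresholdPct;
  }
}
{code}

A dfsAdmin-driven reconfiguration, as proposed above, would then update the threshold at runtime instead of requiring an NN restart.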

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10967:
-
Attachment: (was: HDFS-10967.poc.patch)

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-10-10 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563049#comment-15563049
 ] 

Arpit Agarwal commented on HDFS-10301:
--

Hi [~shv], the v15 patch lgtm. Thank you for waiting. Assuming Daryn is okay 
with this approach we can commit it.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, 
> HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, 
> HDFS-10301.012.patch, HDFS-10301.013.patch, HDFS-10301.014.patch, 
> HDFS-10301.015.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report, and 
> it then sends the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing of storages from different 
> reports. This screws up the blockReportId field, which makes the NameNode think 
> that some storages are zombie. Replicas from zombie storages are immediately 
> removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Attachment: HDFS-10985-branch-2.8.001.patch

Thank you [~ste...@apache.org] for your review. Can you also have a look at the 
v1 patch that addressed one more place in the test?
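For readers unfamiliar with the pattern, a minimal sketch of the waitFor-based retry this issue proposes; it is not the attached patch, the polled condition is a placeholder, and the Guava {{Supplier}}-based signature of the 2.x {{GenericTestUtils}} is assumed:

{code}
// Sketch only: poll a condition instead of sleeping a fixed 10 seconds
// before asserting. The AtomicBoolean stands in for the real failover check.
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;

class WaitForSketch {
  static void awaitTrue(final AtomicBoolean condition)
      throws TimeoutException, InterruptedException {
    GenericTestUtils.waitFor(new Supplier<Boolean>() {
      @Override
      public Boolean get() {
        return condition.get();   // placeholder for the assertion condition
      }
    }, 100, 60000);               // check every 100 ms, time out after 60 s
  }
}
{code}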

> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, 
> HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed 
> time sleep before assertions. This may fail sometimes though 10 seconds are 
> generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
> retry the assertions.
> This is also true to {{TestZKFailoverController#testGracefulFailover}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Patch Available  (was: Open)

Fixed the JMXGet issue and verified that it works locally with several runs.
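For context, a generic Hadoop metrics2 sketch of how a value can reach external sinks as a time series in addition to JMX; this is not the attached HDFS-10984 patch, and the source and gauge names are placeholders:

{code}
// Sketch only: register an annotated source with the default metrics system
// so configured metrics2 sinks receive the value as a time series.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

@Metrics(name = "NNTopSketch", about = "nntop as metrics (sketch)", context = "dfs")
class NNTopMetricsSketch {
  @Metric("Total ops in the current nntop window (placeholder)")
  MutableGaugeLong topWindowTotalOps;

  static NNTopMetricsSketch create() {
    // must be registered before update() is called
    return DefaultMetricsSystem.instance()
        .register("NNTopSketch", "nntop as metrics (sketch)",
            new NNTopMetricsSketch());
  }

  void update(long ops) {
    topWindowTotalOps.set(ops);
  }
}
{code}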

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Status: Open  (was: Patch Available)

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10984) Expose nntop output as metrics

2016-10-10 Thread Siddharth Wagle (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-10984:
---
Attachment: HDFS-10984.v2.patch

> Expose nntop output as metrics   
> -
>
> Key: HDFS-10984
> URL: https://issues.apache.org/jira/browse/HDFS-10984
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
> Fix For: 2.7.3
>
> Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, 
> HDFS-10984.v2.patch
>
>
> The nntop output is already exposed via JMX with HDFS-6982.
> However external metrics systems do not get this data. It would be valuable 
> to track this as a timeseries as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Description: 
{{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed time 
sleep before assertions. This may fail sometimes though 10 seconds are 
generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
retry the assertions.

This is also true to {{TestZKFailoverController#testGracefulFailover}}.

  was:
{{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed time 
sleep before assertions. This may fail sometimes though 10 seconds are 
generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
retry the assertions.

If this makes sense, we can address all other places in 
{{TestZKFailoverController}}, including {{testGracefulFailover}} and 
{{testDontFailoverToUnhealthyNode}}.

Summary: o.a.h.ha.TestZKFailoverController should not use fixed time 
sleep before assertions  (was: o.a.h.ha.TestZKFailoverController should not use 
fixed time sleep before assertsions)

> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertions
> ---
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10985.000.patch, HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed 
> time sleep before assertions. This may fail sometimes though 10 seconds are 
> generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
> retry the assertions.
> This is also true to {{TestZKFailoverController#testGracefulFailover}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-10-10 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563010#comment-15563010
 ] 

Lei (Eddy) Xu commented on HDFS-9668:
-

Hi, [~jingcheng...@intel.com]

Thanks for the updates. Some nits:

{code}
boolean useFairLock = conf.getBoolean("dfs.datanode.dataset.lock.fair", true);

blockOpLocksSize = conf.getInt("dfs.datanode.dataset.lock.size", 1024);
{code}
Please define configuration keys and default values in {{DFSConfigKeys}}.
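As a reference for the nit above, a sketch of what the constants could look like; the constant names are hypothetical and not taken from the patch:

{code}
// Sketch only: mirror the string keys and defaults in DFSConfigKeys-style
// constants instead of inlining the literals at the call site.
public final class DatasetLockKeysSketch {
  public static final String  DFS_DATANODE_DATASET_LOCK_FAIR_KEY =
      "dfs.datanode.dataset.lock.fair";
  public static final boolean DFS_DATANODE_DATASET_LOCK_FAIR_DEFAULT = true;

  public static final String DFS_DATANODE_DATASET_LOCK_SIZE_KEY =
      "dfs.datanode.dataset.lock.size";
  public static final int    DFS_DATANODE_DATASET_LOCK_SIZE_DEFAULT = 1024;

  private DatasetLockKeysSketch() {}
}
{code}

The call sites would then read, e.g., {{conf.getBoolean(DFS_DATANODE_DATASET_LOCK_FAIR_KEY, DFS_DATANODE_DATASET_LOCK_FAIR_DEFAULT)}}.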

Btw, what is your plan to coordinate this JIRA with HDFS-10804?

+1 pending after addressing the comments. 

Thanks!


> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, 
> HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, 
> HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, 
> HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-2.patch, HDFS-9668-3.patch, 
> HDFS-9668-4.patch, HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, 
> HDFS-9668-8.patch, HDFS-9668-9.patch, execution_time.png
>
>
> During the HBase test on tiered HDFS storage (the WAL is stored on SSD/RAMDISK, 
> and all other files are stored on HDD), we observed many threads BLOCKED for a 
> long time on FsDatasetImpl in the DataNode. The following is part of the jstack 
> result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution time of some operations in FsDatasetImpl during the 
> test. The following is the result.
> !execution_time.png!
> The operations finalizeBlock, addBlock and createRbw on HDD under heavy load 
> take a really long time.
> It means one slow operation of finalizeBlock, addBlock and createRbw in a 

[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertsions

2016-10-10 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Attachment: HDFS-10985.001.patch

The v1 patch also fixed the {{testGracefulFailover}} test case.

> o.a.h.ha.TestZKFailoverController should not use fixed time sleep before 
> assertsions
> 
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10985.000.patch, HDFS-10985.001.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed 
> time sleep before assertions. This may fail sometimes though 10 seconds are 
> generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to 
> retry the assertions.
> If this makes sense, we can address all other places in 
> {{TestZKFailoverController}}, including {{testGracefulFailover}} and 
> {{testDontFailoverToUnhealthyNode}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-10993) rename may fail without a clear message indicating the failure reason.

2016-10-10 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge reassigned HDFS-10993:
-

Assignee: John Zhuge

> rename may fail without a clear message indicating the failure reason.
> --
>
> Key: HDFS-10993
> URL: https://issues.apache.org/jira/browse/HDFS-10993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: John Zhuge
>
> Currently the FSDirRenameOp$unprotectedRenameTo  looks like
> {code}
>  static INodesInPath unprotectedRenameTo(FSDirectory fsd,
>   final INodesInPath srcIIP, final INodesInPath dstIIP, long timestamp)
>   throws IOException {
> assert fsd.hasWriteLock();
> final INode srcInode = srcIIP.getLastINode();
> try {
>   validateRenameSource(fsd, srcIIP);
> } catch (SnapshotException e) {
>   throw e;
> } catch (IOException ignored) {
>   return null;
> }
> String src = srcIIP.getPath();
> String dst = dstIIP.getPath();
> // validate the destination
> if (dst.equals(src)) {
>   return dstIIP;
> }
> try {
>   validateDestination(src, dst, srcInode);
> } catch (IOException ignored) {
>   return null;
> }
> if (dstIIP.getLastINode() != null) {
>   NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
>   "failed to rename " + src + " to " + dst + " because destination " +
>   "exists");
>   return null;
> }
> INode dstParent = dstIIP.getINode(-2);
> if (dstParent == null) {
>   NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
>   "failed to rename " + src + " to " + dst + " because destination's 
> " +
>   "parent does not exist");
>   return null;
> }
> fsd.ezManager.checkMoveValidity(srcIIP, dstIIP, src);
> // Ensure dst has quota to accommodate rename
> verifyFsLimitsForRename(fsd, srcIIP, dstIIP);
> verifyQuotaForRename(fsd, srcIIP, dstIIP);
> RenameOperation tx = new RenameOperation(fsd, srcIIP, dstIIP);
> boolean added = false;
> INodesInPath renamedIIP = null;
> try {
>   // remove src
>   if (!tx.removeSrc4OldRename()) {
> return null;
>   }
>   renamedIIP = tx.addSourceToDestination();
>   added = (renamedIIP != null);
>   if (added) {
> if (NameNode.stateChangeLog.isDebugEnabled()) {
>   NameNode.stateChangeLog.debug("DIR* FSDirectory" +
>   ".unprotectedRenameTo: " + src + " is renamed to " + dst);
> }
> tx.updateMtimeAndLease(timestamp);
> tx.updateQuotasInSourceTree(fsd.getBlockStoragePolicySuite());
> return renamedIIP;
>   }
> } finally {
>   if (!added) {
> tx.restoreSource();
>   }
> }
> NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
> "failed to rename " + src + " to " + dst);
> return null;
>   }
> {code}
> There are several places that return null without a clear message. Though that 
> seems to be intentional in the code, it leaves the user to guess what's going 
> on.
> It seems to make sense to log a warning for each failure scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10993) rename may fail without a clear message indicating the failure reason.

2016-10-10 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-10993:


 Summary: rename may fail without a clear message indicating the 
failure reason.
 Key: HDFS-10993
 URL: https://issues.apache.org/jira/browse/HDFS-10993
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Yongjun Zhang


Currently the FSDirRenameOp$unprotectedRenameTo  looks like
{code}
 static INodesInPath unprotectedRenameTo(FSDirectory fsd,
  final INodesInPath srcIIP, final INodesInPath dstIIP, long timestamp)
  throws IOException {
assert fsd.hasWriteLock();
final INode srcInode = srcIIP.getLastINode();
try {
  validateRenameSource(fsd, srcIIP);
} catch (SnapshotException e) {
  throw e;
} catch (IOException ignored) {
  return null;
}

String src = srcIIP.getPath();
String dst = dstIIP.getPath();
// validate the destination
if (dst.equals(src)) {
  return dstIIP;
}

try {
  validateDestination(src, dst, srcInode);
} catch (IOException ignored) {
  return null;
}

if (dstIIP.getLastINode() != null) {
  NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
  "failed to rename " + src + " to " + dst + " because destination " +
  "exists");
  return null;
}
INode dstParent = dstIIP.getINode(-2);
if (dstParent == null) {
  NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
  "failed to rename " + src + " to " + dst + " because destination's " +
  "parent does not exist");
  return null;
}

fsd.ezManager.checkMoveValidity(srcIIP, dstIIP, src);
// Ensure dst has quota to accommodate rename
verifyFsLimitsForRename(fsd, srcIIP, dstIIP);
verifyQuotaForRename(fsd, srcIIP, dstIIP);

RenameOperation tx = new RenameOperation(fsd, srcIIP, dstIIP);

boolean added = false;

INodesInPath renamedIIP = null;
try {
  // remove src
  if (!tx.removeSrc4OldRename()) {
return null;
  }

  renamedIIP = tx.addSourceToDestination();
  added = (renamedIIP != null);
  if (added) {
if (NameNode.stateChangeLog.isDebugEnabled()) {
  NameNode.stateChangeLog.debug("DIR* FSDirectory" +
  ".unprotectedRenameTo: " + src + " is renamed to " + dst);
}

tx.updateMtimeAndLease(timestamp);
tx.updateQuotasInSourceTree(fsd.getBlockStoragePolicySuite());

return renamedIIP;
  }
} finally {
  if (!added) {
tx.restoreSource();
  }
}
NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
"failed to rename " + src + " to " + dst);
return null;
  }
{code}

There are several places that return null without a clear message. Though that 
seems to be intentional in the code, it leaves the user to guess what's going on.

It seems to make sense to log a warning for each failure scenario.
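For illustration, a sketch of the kind of message being asked for, shown as a drop-in for the {{validateDestination}} try/catch in the method above; this is not a committed change and only mirrors the existing warn() calls:

{code}
// Sketch only: surface the reason instead of silently returning null.
try {
  validateDestination(src, dst, srcInode);
} catch (IOException e) {
  NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " +
      "failed to rename " + src + " to " + dst + ": " + e.getMessage());
  return null;
}
{code}

The other silent {{return null}} branches would get analogous messages.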




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10971) Distcp should not copy replication factor if source file is erasure coded

2016-10-10 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562891#comment-15562891
 ] 

Wei-Chiu Chuang commented on HDFS-10971:


It makes sense to me to add a CreateFlag for striped files. Adding a new -p flag 
to preserve the EC policy also makes sense to me.

> Distcp should not copy replication factor if source file is erasure coded
> -
>
> Key: HDFS-10971
> URL: https://issues.apache.org/jira/browse/HDFS-10971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: distcp
>Affects Versions: 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-10971.testcase.patch
>
>
> The current erasure coding implementation uses the replication factor field to 
> store the erasure coding policy.
> Distcp copies the source file's replication factor to the destination if 
> {{-pr}} is specified. However, if the source file is EC, the replication 
> factor (which encodes the EC policy) should not be replicated to the 
> destination file. When an HdfsFileStatus is converted to FileStatus, the 
> replication factor is set to 0 if it's an EC file.
> In fact, I will attach a test case that shows trying to replicate the 
> replication factor of an EC file results in an IOException: "Requested 
> replication factor of 0 is less than the required minimum of 1 for 
> /tmp/dst/dest2"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10971) Distcp should not copy replication factor if source file is erasure coded

2016-10-10 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562859#comment-15562859
 ] 

Andrew Wang commented on HDFS-10971:


I wonder what the right distcp behavior is in this case of "-pr". It indicates 
that the user wants the dest file to be replicated if the src file is 
replicated, even if a dst parent directory has an EC policy set. I don't think 
we have a create API that supports that right now.

Combinations:

* no "-pr" specified: dst file is written with whatever is the default for that 
destination path, which could be EC or not
* "-pr" on a replicated file: dst is written replicated, with the same 
replication factor
* "-pr" on a striped file: Not sure, but I lean toward the same behavior as no 
"-pr" flag: dst file is written with whatever is the default for that 
destination path, which could be EC or not. We could then add a new "-p" flag 
to additionally preserve EC policy.
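For illustration, a sketch of the skip-EC-sources behavior discussed here; it is not the DistCp patch, the helper name is a placeholder, and it assumes the destination file already exists (as it does after a copy):

{code}
// Sketch only: do not preserve the replication factor when the source
// FileStatus reports an erasure-coded file, which (per the description
// below) surfaces as replication == 0.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class PreserveReplicationSketch {
  static void maybePreserveReplication(FileSystem targetFS, Path dst,
      FileStatus srcStatus) throws IOException {
    short srcRepl = srcStatus.getReplication();
    if (srcRepl > 0                // 0 == EC source: nothing to preserve
        && srcRepl != targetFS.getFileStatus(dst).getReplication()) {
      targetFS.setReplication(dst, srcRepl);
    }
  }
}
{code}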

> Distcp should not copy replication factor if source file is erasure coded
> -
>
> Key: HDFS-10971
> URL: https://issues.apache.org/jira/browse/HDFS-10971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: distcp
>Affects Versions: 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-10971.testcase.patch
>
>
> The current erasure coding implementation uses the replication factor field to 
> store the erasure coding policy.
> Distcp copies the source file's replication factor to the destination if 
> {{-pr}} is specified. However, if the source file is EC, the replication 
> factor (which encodes the EC policy) should not be replicated to the 
> destination file. When an HdfsFileStatus is converted to FileStatus, the 
> replication factor is set to 0 if it's an EC file.
> In fact, I will attach a test case that shows trying to replicate the 
> replication factor of an EC file results in an IOException: "Requested 
> replication factor of 0 is less than the required minimum of 1 for 
> /tmp/dst/dest2"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10987) Make Decommission less expensive when lot of blocks present.

2016-10-10 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562723#comment-15562723
 ] 

Brahma Reddy Battula commented on HDFS-10987:
-

Thanks [~kihwal] for taking a look.

IIUC, [~daryn] was doing fine-grained locking, which is a big change, right? What's 
your view on the current patch?

JFYI, I tested the patch: the NN stayed available (only the first pass took 15 to 
25 sec) while the decommission was running, and all the dependent 
services (HBase, Spark, Hive, ...) were able to communicate.
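For illustration, one way to bound how long the namesystem lock is held during such a scan; this is not necessarily what the attached patch does, the chunk size is arbitrary, and whether the block iterator stays valid across a lock release is exactly why fine-grained locking is a big change:

{code}
// Sketch only: process the decommissioning node's blocks in bounded chunks
// and yield the namesystem write lock between chunks so other RPCs can run.
// Caveat: the iterator must tolerate (or restart after) concurrent changes
// once the lock is released -- that is the difficult part of this change.
import java.util.Iterator;
import org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;
import org.apache.hadoop.hdfs.server.namenode.Namesystem;

class DecommissionScanSketch {
  static void scan(Namesystem ns, DatanodeDescriptor node) {
    int processed = 0;
    ns.writeLock();
    try {
      Iterator<BlockInfo> it = node.getBlockIterator();
      while (it.hasNext()) {
        it.next();                       // placeholder for the per-block check
        if (++processed % 10000 == 0) {  // arbitrary chunk size
          ns.writeUnlock();              // let queued RPCs in
          ns.writeLock();
        }
      }
    } finally {
      ns.writeUnlock();
    }
  }
}
{code}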

> Make Decommission less expensive when lot of blocks present.
> 
>
> Key: HDFS-10987
> URL: https://issues.apache.org/jira/browse/HDFS-10987
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Attachments: HDFS-10987.patch
>
>
> When a user wants to decommission a node that has 50M+ blocks, it can hold the 
> namesystem lock for a long time; we've seen it take 36+ seconds.
> During this time the NameNode is not available, and since decommissioning runs 
> continuously until all the blocks are replicated, the NameNode stays 
> unavailable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10987) Make Decommission less expensive when lot of blocks present.

2016-10-10 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562642#comment-15562642
 ] 

Kihwal Lee commented on HDFS-10987:
---

We've seen this also. We don't have that many blocks per node, but still the 
lock time can be multiple seconds. [~daryn] was going to do something similar, 
but he was also improving locking in the replication monitor.

> Make Decommission less expensive when lot of blocks present.
> 
>
> Key: HDFS-10987
> URL: https://issues.apache.org/jira/browse/HDFS-10987
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Attachments: HDFS-10987.patch
>
>
> When a user wants to decommission a node that has 50M+ blocks, it can hold the 
> namesystem lock for a long time; we've seen it take 36+ seconds.
> During this time the NameNode is not available, and since decommissioning runs 
> continuously until all the blocks are replicated, the NameNode stays 
> unavailable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10992) file is under construction but no leases found

2016-10-10 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562569#comment-15562569
 ] 

Rushabh S Shah commented on HDFS-10992:
---

Is this a dupe of HDFS-10763?

> file is under construction but no leases found
> --
>
> Key: HDFS-10992
> URL: https://issues.apache.org/jira/browse/HDFS-10992
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
> Environment: hortonworks 2.3 build 2557. 10 Datanodes , 2 NameNode in 
> auto failover
>Reporter: Chernishev Aleksandr
>
> On HDFS, after writing a fairly small number of files (at least 1000) of size 
> 150 MB - 1.6 GB, we found 13 damaged files with an incomplete last block.
> hadoop fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv 
> -openforwrite -files -blocks -locations
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8
> Connecting to namenode via 
> http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2Ffiles%2Fload_tarifer-zf-4_20160902165521521.csv
> FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path 
> /hadoop/files/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 
> MSK 2016
> /hadoop/files/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 
> block(s), OPENFORWRITE:  MISSING 1 blocks of total size 115289753 B
> 0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 
> len=134217728 repl=4 
> [DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]]
> 1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 
> len=134217728 repl=3 
> [DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]]
> 2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 
> len=134217728 repl=3 
> [DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
> 3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 
> len=134217728 repl=3 
> [DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]]
> 4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 
> len=134217728 repl=4 
> [DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
> 5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 
> len=134217728 repl=3 
> [DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK],
>  
> DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]]
> 6. 
> BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED,
>  truncateBlock=null, primaryNodeIndex=-1, 
> replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW],
>  
> ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW],
>  
> ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]}
>  len=115289753 MISSING!
> Status: CORRUPT
>  Total size:  920596121 B
>  Total dirs:  0
>  Total files: 1
>  Total symlinks:  0
>  Total blocks (validated):7 (avg. block size 131513731 B)
>   
>   UNDER MIN REPL'D BLOCKS:1 (14.285714 %)
>   dfs.namenode.replication.min:   1
>   CORRUPT FILES:  1
>   MISSING BLOCKS: 1
>   MISSING SIZE:   115289753 B
>   
>  Minimally replicated blocks: 6 (85.71429 %)
>  Over-replicated 

[jira] [Updated] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-10-10 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-10797:
-
Release Note: Disk usage summaries previously incorrectly counted files 
twice if they had been renamed since being snapshotted. Summaries now include 
current data plus snapshotted data that is no longer in the directory, whether 
due to deletion or being moved outside of the directory.

Thanks [~xiaochen]. I added a release note...

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.8.0
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, 
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, 
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, 
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk space is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However, that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get counted twice in the disk usage tally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10992) file is under construction but no leases found

2016-10-10 Thread Chernishev Aleksandr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chernishev Aleksandr updated HDFS-10992:

Description: 
On HDFS, after writing a fairly small number of files (at least 1000) of size 
150 MB - 1.6 GB, we found 13 damaged files with an incomplete last block.

hadoop fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv -openforwrite 
-files -blocks -locations
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8
Connecting to namenode via 
http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2Ffiles%2Fload_tarifer-zf-4_20160902165521521.csv
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path 
/hadoop/files/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 
MSK 2016
/hadoop/files/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 
block(s), OPENFORWRITE:  MISSING 1 blocks of total size 115289753 B
0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]]
1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK],
 
DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK],
 
DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]]
2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]]
4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK],
 
DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]]
6. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED,
 truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW],
 
ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW],
 
ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]}
 len=115289753 MISSING!

Status: CORRUPT
 Total size:920596121 B
 Total dirs:0
 Total files:   1
 Total symlinks:0
 Total blocks (validated):  7 (avg. block size 131513731 B)
  
  UNDER MIN REPL'D BLOCKS:  1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:1
  MISSING BLOCKS:   1
  MISSING SIZE: 115289753 B
  
 Minimally replicated blocks:   6 (85.71429 %)
 Over-replicated blocks:2 (28.571428 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor:3
 Average block replication: 2.857143
 Corrupt blocks:0
 Missing replicas:  0 (0.0 %)
 Number of data-nodes:  10
 Number of racks:   1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds


The filesystem under path 
'/hadoop/files/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT

The file is UNDER_RECOVERY: the NameNode thinks the last block is in COMMITTED 
state, while the DataNode thinks the block is in RBW state. Recovery is not 
executed. The last 

[jira] [Updated] (HDFS-10992) file is under construction but no leases found

2016-10-10 Thread Chernishev Aleksandr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chernishev Aleksandr updated HDFS-10992:

Description: 
On HDFS, after writing a fairly small number of files (at least 1000) of size 
150 MB - 1.6 GB, we found 13 damaged files with an incomplete last block.

hadoop fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv -openforwrite 
-files -blocks -locations
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8
Connecting to namenode via 
http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2Ffiles%2Fload_tarifer-zf-4_20160902165521521.csv
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path 
/hadoop/files/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 
MSK 2016
/hadoop/files/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 
block(s), OPENFORWRITE:  MISSING 1 blocks of total size 115289753 B
0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]]
1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK],
 
DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK],
 
DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]]
2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]]
4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK],
 
DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]]
6. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED,
 truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW],
 
ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW],
 
ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]}
 len=115289753 MISSING!

Status: CORRUPT
 Total size:920596121 B
 Total dirs:0
 Total files:   1
 Total symlinks:0
 Total blocks (validated):  7 (avg. block size 131513731 B)
  
  UNDER MIN REPL'D BLOCKS:  1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:1
  MISSING BLOCKS:   1
  MISSING SIZE: 115289753 B
  
 Minimally replicated blocks:   6 (85.71429 %)
 Over-replicated blocks:2 (28.571428 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor:3
 Average block replication: 2.857143
 Corrupt blocks:0
 Missing replicas:  0 (0.0 %)
 Number of data-nodes:  10
 Number of racks:   1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds


The filesystem under path 
'/hadoop/files/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT

The file is UNDER_RECOVERY: the NameNode thinks the last block is in COMMITTED 
state, while the DataNode thinks the block is in RBW state. Recovery is not 
executed. The last 

[jira] [Updated] (HDFS-10992) file is under construction but no leases found

2016-10-10 Thread Chernishev Aleksandr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chernishev Aleksandr updated HDFS-10992:

Description: 
On HDFS, after writing a fairly small number of files (at least 1000) of size 
150 MB - 1.6 GB, we found 13 damaged files with an incomplete last block.

hadoop fsck /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv 
-openforwrite -files -blocks -locations
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8
Connecting to namenode via 
http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2F811-ITF-ZO-P-bad%2Fload_tarifer-zf-4_20160902165521521.csv
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path 
/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 
17:12:25 MSK 2016
/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv 920596121 
bytes, 7 block(s), OPENFORWRITE:  MISSING 1 blocks of total size 115289753 B
0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]]
1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK],
 
DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK],
 
DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]]
2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]]
4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK],
 
DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]]
6. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED,
 truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW],
 
ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW],
 
ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]}
 len=115289753 MISSING!

Status: CORRUPT
 Total size:920596121 B
 Total dirs:0
 Total files:   1
 Total symlinks:0
 Total blocks (validated):  7 (avg. block size 131513731 B)
  
  UNDER MIN REPL'D BLOCKS:  1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:1
  MISSING BLOCKS:   1
  MISSING SIZE: 115289753 B
  
 Minimally replicated blocks:   6 (85.71429 %)
 Over-replicated blocks:2 (28.571428 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor:3
 Average block replication: 2.857143
 Corrupt blocks:0
 Missing replicas:  0 (0.0 %)
 Number of data-nodes:  10
 Number of racks:   1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds


The filesystem under path 
'/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT

The file is UNDER_RECOVERY: the NameNode thinks the last block is in the 
COMMITTED state, while the DataNodes think that
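
To see what the DataNodes actually hold for that last block, the replica files 
can be inspected directly on a DataNode's local disks (a sketch only: the 
/data/*/current layout below is an assumed dfs.datanode.data.dir value, and the 
block id is taken from entry 6 of the fsck output above):

find /data/*/current -name 'blk_1084952887*' -ls   # data dir layout is an assumption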

[jira] [Updated] (HDFS-10992) file is under construction but no leases found

2016-10-10 Thread Chernishev Aleksandr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chernishev Aleksandr updated HDFS-10992:

Description: 
On HDFS, after writing a relatively small number of files (at least 1000) of 
size 150 MB to 1.6 GB, we found 13 damaged files with an incomplete last block.
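
For reference, the same fsck switch applied to the parent directory will list 
every file that is still open for write (a sketch only; the directory and the 
grep filter are illustrative, not part of the original report):

hadoop fsck /hadoop/811-ITF-ZO-P-bad -openforwrite -files | grep OPENFORWRITE   # directory is illustrative

The per-file check for one of the damaged files, and its output, follow: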

hadoop fsck /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv 
-openforwrite -files -blocks -locations
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8
Connecting to namenode via 
http://hadoop-hdfs:50070/fsck?ugi=hdfs&openforwrite=1&files=1&blocks=1&locations=1&path=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2F811-ITF-ZO-P-bad%2Fload_tarifer-zf-4_20160902165521521.csv
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path 
/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 
17:12:25 MSK 2016
/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv 920596121 
bytes, 7 block(s), OPENFORWRITE:  MISSING 1 blocks of total size 115289753 B
0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]]
1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK],
 
DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK],
 
DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]]
2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK],
 
DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]]
4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 
repl=4 
[DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK],
 
DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK],
 
DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]]
5. BP-1552885336-10.0.0.178-1446159880991:blk_1084952880_17800120 len=134217728 
repl=3 
[DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK],
 
DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK],
 
DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]]
6. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED,
 truncateBlock=null, primaryNodeIndex=-1, 
replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW],
 
ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW],
 
ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]}
 len=115289753 MISSING!

Status: CORRUPT
 Total size:920596121 B
 Total dirs:0
 Total files:   1
 Total symlinks:0
 Total blocks (validated):  7 (avg. block size 131513731 B)
  
  UNDER MIN REPL'D BLOCKS:  1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:1
  MISSING BLOCKS:   1
  MISSING SIZE: 115289753 B
  
 Minimally replicated blocks:   6 (85.71429 %)
 Over-replicated blocks:2 (28.571428 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor:3
 Average block replication: 2.857143
 Corrupt blocks:0
 Missing replicas:  0 (0.0 %)
 Number of data-nodes:  10
 Number of racks:   1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds


The filesystem under path 
'/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT

The file is UNDER_RECOVERY: the NameNode thinks the last block is in the 
COMMITTED state, while the DataNodes think that
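
If the goal is simply to get the NameNode to close such a file (release the 
dangling lease and either complete or discard the COMMITTED last block), one 
option on Hadoop 2.7 and later is the lease-recovery debug command below; this 
is a hedged sketch rather than part of the original report:

hdfs debug recoverLease -path /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv -retries 3   # retry count is arbitrary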
