[jira] [Updated] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-31 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9833:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~rakeshr] for the great contribution!

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.
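
A minimal, self-contained sketch of the idea above (illustrative only, not the 
HDFS-9719 code): XOR parity stands in for the real Reed-Solomon decoding, and 
{{java.util.zip.CRC32}} stands in for HDFS's per-chunk checksums. The lost 
block is decoded in memory and only its recomputed checksum is kept; nothing 
is written to a remote datanode.

{code}
import java.util.zip.CRC32;

public class RecomputeChecksumSketch {
  // XOR-decode the missing block from the surviving blocks and the parity
  // block (the XOR of all data blocks).
  static byte[] reconstruct(byte[][] survivors, byte[] parity) {
    byte[] out = parity.clone();
    for (byte[] s : survivors) {
      for (int i = 0; i < out.length; i++) {
        out[i] ^= s[i];
      }
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] b0 = {1, 2, 3, 4};
    byte[] b1 = {5, 6, 7, 8};            // pretend this block is lost
    byte[] parity = new byte[4];
    for (int i = 0; i < 4; i++) {
      parity[i] = (byte) (b0[i] ^ b1[i]);
    }
    byte[] recovered = reconstruct(new byte[][] {b0}, parity);
    CRC32 crc = new CRC32();
    crc.update(recovered);               // checksum only; data is discarded
    System.out.printf("recovered block checksum: %x%n", crc.getValue());
  }
}
{code}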



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309201#comment-15309201
 ] 

Hadoop QA commented on HDFS-10471:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 
new + 226 unchanged - 2 fixed = 228 total (was 228) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 14s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 0s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807304/HDFS-10471.002.patch |
| JIRA Issue | HDFS-10471 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 08db182fa162 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8ceb06e |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15618/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15618/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HDFS-Build/15618/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15618/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15618/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.

[jira] [Updated] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-31 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10471:
-
Attachment: HDFS-10471.002.patch

Sorry, the failed test {{TestHDFSCLI }} is related. Attach a new patch to fix 
this.

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch, HDFS-10471.002.patch
>
>
> The help message of the SetQuota command is not shown correctly. In the 
> message, the parameter {{quota}} is shown as {{N}}, but {{N}} is never 
> introduced.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-31 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309068#comment-15309068
 ] 

Yiqun Lin edited comment on HDFS-10471 at 6/1/16 1:59 AM:
--

Sorry, the failed test {{TestHDFSCLI}} is related. Attach a new patch to fix 
this.


was (Author: linyiqun):
Sorry, the failed test {{TestHDFSCLI }} is related. Attach a new patch to fix 
this.

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch, HDFS-10471.002.patch
>
>
> The help message of the SetQuota command is not shown correctly. In the 
> message, the parameter {{quota}} is shown as {{N}}, but {{N}} is never 
> introduced.
> {noformat}
> -setQuota <quota> <dirname>...<dirname>: Set the quota <quota> for each 
> directory <dirName>.
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309065#comment-15309065
 ] 

Rakesh R commented on HDFS-9833:


Thanks a lot [~drankye] for the good support!

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missed/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for the block reconstruction 
> could be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Attachment: datanode_html.001.jpg

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't have much information except for the 
> node name and port. Propose to add more information, similar to the namenode 
> UI, including:
> * Static info (version, block pool and cluster ID)
> * Block pools info (BP IDs, namenode address, actor states)
> * Storage info (volumes, capacity used, reserved, left)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Attachment: (was: datanode_html.001.jpg)

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_utilities.001.jpg, 
> dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't have much information except for the 
> node name and port. Propose to add more information, similar to the namenode 
> UI, including:
> * Static info (version, block pool and cluster ID)
> * Block pools info (BP IDs, namenode address, actor states)
> * Storage info (volumes, capacity used, reserved, left)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Status: In Progress  (was: Patch Available)

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.0, 2.5.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't have much information except for the 
> node name and port. Propose to add more information, similar to the namenode 
> UI, including:
> * Static info (version, block pool and cluster ID)
> * Block pools info (BP IDs, namenode address, actor states)
> * Storage info (volumes, capacity used, reserved, left)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Description: 
At present, the datanode web UI doesn't have much information except for the 
node name and port. Propose to add more information, similar to the namenode 
UI, including:

* Static info (version, block pool and cluster ID)
* Block pools info (BP IDs, namenode address, actor states)
* Storage info (volumes, capacity used, reserved, left)
* Utilities (logs)

  was:
At present, the datanode web UI doesn't have much information except for the 
node name and port. Propose to add more information, similar to the namenode 
UI, including:

* Static info (version, block pool and cluster ID)
* Running state (active, decommissioning, decommissioned, or lost, etc.)
* Summary (blocks, capacity, storage, etc.)
* Utilities (logs)


> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't have much information except for the 
> node name and port. Propose to add more information, similar to the namenode 
> UI, including:
> * Static info (version, block pool and cluster ID)
> * Block pools info (BP IDs, namenode address, actor states)
> * Storage info (volumes, capacity used, reserved, left)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309023#comment-15309023
 ] 

Mingliang Liu commented on HDFS-10415:
--

Thank you [~cmccabe] for your review and commit!

> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky

2016-05-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309005#comment-15309005
 ] 

Hudson commented on HDFS-9466:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #9891 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9891/])
HDFS-9466. TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure 
(cmccabe: rev c7921c9bddb79c9db5059b6c3f7a3a586a3cd95b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java


> TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
> 
>
> Key: HDFS-9466
> URL: https://issues.apache.org/jira/browse/HDFS-9466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, hdfs-client
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch, 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache-output.txt
>
>
> This test is flaky and fails quite frequently in trunk.
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684)
> {noformat}
> Thanks to [~xiaochen] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308997#comment-15308997
 ] 

Colin Patrick McCabe commented on HDFS-10301:
-

bq. Vinitha's patch adds one RPC only in the case when block reports are sent 
in multiple RPCs.

The case where block reports are sent in multiple RPCs is exactly the case 
where scalability is the most important, since it indicates that we have a 
large number of blocks.  My patch adds no new RPCs.  If we are going to take an 
alternate approach, it should not involve a performance regression.

bq. Could you please review the patch.

I did review the patch.  I suggested adding an optional field in an existing 
RPC rather than adding a new RPC, and stated that I was -1 on adding new RPC 
load to the NN.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When NameNode is busy a DataNode can timeout sending a block report. Then it 
> sends the block report again. Then NameNode while process these two reports 
> at the same time can interleave processing storages from different reports. 
> This screws up the blockReportId field, which makes NameNode think that some 
> storages are zombie. Replicas from zombie storages are immediately removed, 
> causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10458) getFileEncryptionInfo should return quickly for non-encrypted cluster

2016-05-31 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308994#comment-15308994
 ] 

Konstantin Shvachko commented on HDFS-10458:


The patch looks good.
I would add a similar condition for empty {{encryptionZones}} at the start of 
{{EncryptionZoneManager.getEncryptionZoneForPath()}}. This method is used many 
times, both for write operations like {{startFile()}} and for reads like 
{{getFileInfo()}}. Even though this will still be under the lock, returning 
and releasing the lock quickly should be beneficial.

> getFileEncryptionInfo should return quickly for non-encrypted cluster
> -
>
> Key: HDFS-10458
> URL: https://issues.apache.org/jira/browse/HDFS-10458
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.6.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-10458.00.patch
>
>
> {{FSDirectory#getFileEncryptionInfo}} always acquires {{readLock}} and checks 
> if the path belongs to an EZ. For a busy system with potentially many listing 
> operations, this could cause locking contention.
> I think we should add a call {{EncryptionZoneManager#hasEncryptionZone()}} to 
> return whether the system has any EZ. If no EZ at all, 
> {{getFileEncryptionInfo}} should return null without {{readLock}}.
> If {{hasEncryptionZone}} is only used in the above scenario, maybe it 
> doesn't itself need a {{readLock}} -- if the system doesn't have any EZ when 
> {{getFileEncryptionInfo}} is called on a path, it means the path cannot be 
> encrypted.
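
A hypothetical, simplified sketch of the proposed fast path (class and method 
names are stand-ins for {{FSDirectory}}/{{EncryptionZoneManager}}): when the 
cluster has no encryption zones at all, the read lock is never taken.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class EncryptionZoneManagerSketch {
  private final Map<Long, String> encryptionZones = new ConcurrentHashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  boolean hasEncryptionZone() {
    // No lock needed: if the map is empty, no path can be encrypted.
    return !encryptionZones.isEmpty();
  }

  String getFileEncryptionInfo(String path) {
    if (!hasEncryptionZone()) {
      return null;                  // fast path: readLock never acquired
    }
    lock.readLock().lock();         // slow path, as before
    try {
      return lookupZoneFor(path);
    } finally {
      lock.readLock().unlock();
    }
  }

  private String lookupZoneFor(String path) {
    for (String zone : encryptionZones.values()) {
      if (path.startsWith(zone)) {
        return zone;
      }
    }
    return null;
  }
}
{code}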



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308980#comment-15308980
 ] 

Colin Patrick McCabe edited comment on HDFS-9466 at 6/1/16 12:52 AM:
-

Thanks for the explanation.  It sounds like the race condition is that the 
ShortCircuitRegistry on the DN needs to be informed about the client's decision 
that short-circuit is not working for the block, and this RPC takes time to 
arrive.  That background process races with completing the TCP read 
successfully and checking the number of slots in the unit test.

{code}
   public static interface Visitor {
-void accept(HashMap segments,
+boolean accept(HashMap segments,
 HashMultimap slots);
   }
{code}
I don't think it makes sense to change the return type of the visitor.  While 
you might find a boolean convenient, some other potential users of the 
interface might not find it useful.  Instead, just have your closure modify a 
{{final MutableBoolean}} declared nearby.

{code}
+}, 100, 1);
{code}
It seems like we could lower the latency here (perhaps check every 10 ms) and 
lengthen the timeout.  Since the test timeouts are generally 60s, I don't think 
it makes sense to make this timeout shorter than that.

+1 once that's addressed.  Thanks, [~jojochuang].  Sorry for the delay in 
reviews.


was (Author: cmccabe):
Thanks for the explanation.  It sounds like the race condition is that the 
ShortCircuitRegistry on the DN needs to be informed about the client's decision 
that short-circuit is not working for the block, and this RPC takes time to 
arrive.  That background process races with completing the TCP read 
successfully and checking the number of slots in the unit test.

{code}
   public static interface Visitor {
-void accept(HashMap segments,
+boolean accept(HashMap segments,
 HashMultimap slots);
   }
{code}
I don't think it makes sense to change the return type of the visitor.  While 
you might find a boolean convenient, some other potential users of the 
interface would have no use for it.  Instead, just have your closure modify a 
{{final MutableBoolean}} declared nearby.

{code}
+}, 100, 1);
{code}
No reason to make this shorter than the test limit, surely?

+1 once that's addressed.  Thanks, [~jojochuang].  Sorry for the delay in 
reviews.

> TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
> 
>
> Key: HDFS-9466
> URL: https://issues.apache.org/jira/browse/HDFS-9466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, hdfs-client
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch, 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache-output.txt
>
>
> This test is flaky and fails quite frequently in trunk.
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684)
> {noformat}
> Thanks to [~xiaochen] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308980#comment-15308980
 ] 

Colin Patrick McCabe commented on HDFS-9466:


Thanks for the explanation.  It sounds like the race condition is that the 
ShortCircuitRegistry on the DN needs to be informed about the client's decision 
that short-circuit is not working for the block, and this RPC takes time to 
arrive.  That background process races with completing the TCP read 
successfully and checking the number of slots in the unit test.

{code}
   public static interface Visitor {
-void accept(HashMap segments,
+boolean accept(HashMap segments,
 HashMultimap slots);
   }
{code}
I don't think it makes sense to change the return type of the visitor.  While 
you might find a boolean convenient, some other potential users of the 
interface would have no use for it.  Instead, just have your closure modify a 
{{final MutableBoolean}} declared nearby.

{code}
+}, 100, 1);
{code}
No reason to make this shorter than the test limit, surely?

+1 once that's addressed.  Thanks, [~jojochuang].  Sorry for the delay in 
reviews.
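
A small sketch of the suggestion above, with stand-in types instead of the 
real {{ShortCircuitRegistry}} ones: the visitor's return type stays {{void}}, 
and the result is surfaced through a final {{MutableBoolean}} (commons-lang) 
captured by the anonymous class.

{code}
import java.util.Map;
import org.apache.commons.lang.mutable.MutableBoolean;

class VisitorSketch {
  interface Visitor {
    void accept(Map<Integer, Integer> segments);   // stays void
  }

  static void visit(Map<Integer, Integer> segments, Visitor v) {
    v.accept(segments);
  }

  static boolean hasExpectedSegments(final Map<Integer, Integer> segments,
                                     final int expected) {
    final MutableBoolean ok = new MutableBoolean(false);
    visit(segments, new Visitor() {
      @Override
      public void accept(Map<Integer, Integer> segs) {
        ok.setValue(segs.size() == expected);      // closure writes the result
      }
    });
    return ok.booleanValue();
  }
}
{code}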

> TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
> 
>
> Key: HDFS-9466
> URL: https://issues.apache.org/jira/browse/HDFS-9466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, hdfs-client
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch, 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache-output.txt
>
>
> This test is flaky and fails quite frequently in trunk.
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684)
> {noformat}
> Thanks to [~xiaochen] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308964#comment-15308964
 ] 

Hudson commented on HDFS-10415:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #9890 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9890/])
HDFS-10415. TestDistributedFileSystem#MyDistributedFileSystem attempts 
(cmccabe: rev 29d6cadc52e411990c8237fd2fa71257cea60d9a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java


> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-10415:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to 2.8.

> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308928#comment-15308928
 ] 

Colin Patrick McCabe commented on HDFS-10415:
-

The subclass can change the configuration that gets passed to the superclass.

class SuperClass {
  SuperClass(Configuration conf) {
... initialize superclass part of the object ...
  }
}

class SubClass extends SuperClass {
  SubClass(Configuration conf) {
super(changeConf(conf));
... initialize my part of the object ...
  }

  private static Configuration changeConf(Configuration conf) {
Configuration nconf = new Configuration(conf);
nconf.set("foo", "bar");
return nconf;
  }
}

Having a separate init() method is a well-known antipattern.  Initialization 
belongs in the constructor.  The only time a separate init method is really 
necessary is if you're using a dialect of C++ that doesn't support exceptions.

> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308928#comment-15308928
 ] 

Colin Patrick McCabe edited comment on HDFS-10415 at 6/1/16 12:09 AM:
--

The subclass can change the configuration that gets passed to the superclass.

{code}
class SuperClass {
  SuperClass(Configuration conf) {
... initialize superclass part of the object ...
  }
}

class SubClass extends SuperClass {
  SubClass(Configuration conf) {
super(changeConf(conf));
... initialize my part of the object ...
  }

  private static Configuration changeConf(Configuration conf) {
Configuration nconf = new Configuration(conf);
nconf.set("foo", "bar");
return nconf;
  }
}
{code}

Having a separate init() method is a well-known antipattern.  Initialization 
belongs in the constructor.  The only time a separate init method is really 
necessary is if you're using a dialect of C++ that doesn't support exceptions.


was (Author: cmccabe):
The subclass can change the configuration that gets passed to the superclass.

class SuperClass {
  SuperClass(Configuration conf) {
... initialize superclass part of the object ...
  }
}

class SubClass extends SuperClass {
  SubClass(Configuration conf) {
super(changeConf(conf));
... initialize my part of the object ...
  }

  private static Configuration changeConf(Configuration conf) {
Configuration nconf = new Configuration(conf);
nconf.set("foo", "bar");
return nconf;
  }
}

Having a separate init() method is a well-known antipattern.  Initialization 
belongs in the constructor.  The only time a separate init method is really 
necessary is if you're using a dialect of C++ that doesn't support exceptions.

> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt

2016-05-31 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308913#comment-15308913
 ] 

Kuhu Shukla commented on HDFS-10342:


Thank you, [~xiaobingo]. Sorry about the delay; I have been occupied with some 
non-HDFS work lately. I will work on it later this week. Hope that works! Let 
me know if you have any comments on this. Thanks!

> BlockManager#createLocatedBlocks should not check corrupt replicas if none 
> are corrupt
> --
>
> Key: HDFS-10342
> URL: https://issues.apache.org/jira/browse/HDFS-10342
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>
> {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node 
> while populating the machines array.  There's no need to invoke the method if 
> {{corruptReplicas#numCorruptReplicas(block)}} returned 0.
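
A hypothetical shape of the change (method names from the description above, 
surrounding types simplified to strings): one cheap count lookup up front lets 
the loop skip the per-node check entirely in the common, no-corruption case.

{code}
import java.util.List;

class CorruptReplicaGuardSketch {
  interface CorruptReplicasMap {
    int numCorruptReplicas(String block);
    boolean isReplicaCorrupt(String block, String node);
  }

  static int countGoodReplicas(CorruptReplicasMap corruptReplicas,
                               String block, List<String> nodes) {
    // Look up the corrupt-replica count once, before the loop.
    final boolean anyCorrupt = corruptReplicas.numCorruptReplicas(block) > 0;
    int good = 0;
    for (String node : nodes) {
      // Short-circuit: the per-node lookup runs only if something is corrupt.
      boolean corrupt = anyCorrupt
          && corruptReplicas.isReplicaCorrupt(block, node);
      if (!corrupt) {
        good++;
      }
    }
    return good;
  }
}
{code}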



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10415) TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

2016-05-31 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-10415:

Summary: TestDistributedFileSystem#MyDistributedFileSystem attempts to set 
up statistics before initialize() is called  (was: 
TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2)

> TestDistributedFileSystem#MyDistributedFileSystem attempts to set up 
> statistics before initialize() is called
> -
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10342) BlockManager#createLocatedBlocks should not check corrupt replicas if none are corrupt

2016-05-31 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308898#comment-15308898
 ] 

Xiaobing Zhou commented on HDFS-10342:
--

[~kshukla] would you like to post a patch for it? Thanks.

> BlockManager#createLocatedBlocks should not check corrupt replicas if none 
> are corrupt
> --
>
> Key: HDFS-10342
> URL: https://issues.apache.org/jira/browse/HDFS-10342
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.7.0
>Reporter: Daryn Sharp
>Assignee: Kuhu Shukla
>
> {{corruptReplicas#isReplicaCorrupt(block, node)}} is called for every node 
> while populating the machines array.  There's no need to invoke the method if 
> {{corruptReplicas#numCorruptReplicas(block)}} returned 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10415) TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2

2016-05-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308894#comment-15308894
 ] 

Colin Patrick McCabe commented on HDFS-10415:
-

It sounds like there are no strong objections to HDFS-10415.000.patch and 
HDFS-10415-branch-2.001.patch.  Let's fix this unit test!

We can improve this in a follow-on JIRA (personally, I like the idea of adding 
the initialization to the {{init}} method).  But it's not worth blocking the 
unit test fix.

+1.



> TestDistributedFileSystem#testDFSCloseOrdering() fails on branch-2
> --
>
> Key: HDFS-10415
> URL: https://issues.apache.org/jira/browse/HDFS-10415
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
> Environment: jenkins
>Reporter: Sangjin Lee
>Assignee: Mingliang Liu
> Attachments: HDFS-10415-branch-2.000.patch, 
> HDFS-10415-branch-2.001.patch, HDFS-10415.000.patch
>
>
> {noformat}
> Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
> testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.045 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
>   at 
> org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
>   at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
> {noformat}
> This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other 
> combinations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9476) TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail

2016-05-31 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308893#comment-15308893
 ] 

Xiaobing Zhou commented on HDFS-9476:
-

The new patch looks good, +1.

> TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail
> -
>
> Key: HDFS-9476
> URL: https://issues.apache.org/jira/browse/HDFS-9476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Akira AJISAKA
> Attachments: HDFS-9476.002.patch, HDFS-9476.01.patch
>
>
> This test occasionally fail. For example, the most recent one is:
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2587/
> Error Message
> {noformat}
> Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
> {noformat}
> Stacktrace
> {noformat}
> java.io.IOException: Cannot obtain block length for 
> LocatedBlock{BP-1371507683-67.195.81.153-1448798439809:blk_7162739548153522810_1020;
>  getBlockSize()=1024; corrupt=false; offset=0; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:33080,DS-c5eaf2b4-2ee6-419d-a8a0-44a5df5ef9a1,DISK]]}
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:399)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:343)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:275)
>   at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1046)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1011)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.dfsOpenFileWithRetries(TestDFSUpgradeFromImage.java:177)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyDir(TestDFSUpgradeFromImage.java:213)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.verifyFileSystem(TestDFSUpgradeFromImage.java:228)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.upgradeAndVerify(TestDFSUpgradeFromImage.java:600)
>   at 
> org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromRel1BBWImage(TestDFSUpgradeFromImage.java:622)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10465) libhdfs++: Implement GetBlockLocations

2016-05-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308891#comment-15308891
 ] 

Hadoop QA commented on HDFS-10465:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
37s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 58s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 2s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
19s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 23s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 16s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 33s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807277/HDFS-10465.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-10465 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux c55c1642173c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / f0ef898 |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15617/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15617/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: Implement GetBlockLocations
> --
>
> Key: HDFS-10465
> URL: https://issues.apache.org/jira/browse/HDFS-10465
> Project: Hadoop HDFS
>  Issue Type: 

[jira] [Updated] (HDFS-10433) Make retry also work well for Async DFS

2016-05-31 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-10433:
---
Issue Type: New Feature  (was: Sub-task)
Parent: (was: HDFS-9924)

> Make retry also work well for Async DFS
> 
>
> Key: HDFS-10433
> URL: https://issues.apache.org/jira/browse/HDFS-10433
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Xiaobing Zhou
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h10433_20160524.patch, h10433_20160525.patch, 
> h10433_20160525b.patch, h10433_20160527.patch, h10433_20160528.patch, 
> h10433_20160528c.patch
>
>
> In the current Async DFS implementation, file system calls are invoked and 
> return a Future to clients immediately. Clients call Future#get to retrieve 
> the final results. Future#get internally invokes a chain of callbacks 
> residing in ClientNamenodeProtocolTranslatorPB, ProtobufRpcEngine and 
> ipc.Client. This callback path bypasses the original retry layer/logic 
> designed for synchronous DFS. This proposes refactoring to make retry also 
> work for Async DFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10433) Make retry also work well for Async DFS

2016-05-31 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308833#comment-15308833
 ] 

Tsz Wo Nicholas Sze commented on HDFS-10433:


Thanks Jing.  I will commit this shortly and file a follow-up JIRA.

> Make retry also work well for Async DFS
> 
>
> Key: HDFS-10433
> URL: https://issues.apache.org/jira/browse/HDFS-10433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Xiaobing Zhou
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h10433_20160524.patch, h10433_20160525.patch, 
> h10433_20160525b.patch, h10433_20160527.patch, h10433_20160528.patch, 
> h10433_20160528c.patch
>
>
> In the current Async DFS implementation, file system calls are invoked and 
> return a Future to clients immediately. Clients call Future#get to retrieve 
> the final results. Future#get internally invokes a chain of callbacks 
> residing in ClientNamenodeProtocolTranslatorPB, ProtobufRpcEngine and 
> ipc.Client. This callback path bypasses the original retry layer/logic 
> designed for synchronous DFS. This proposes refactoring to make retry also 
> work for Async DFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks

2016-05-31 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308830#comment-15308830
 ] 

Xiaobing Zhou commented on HDFS-10341:
--

[~ajisakaa] thank you for the work. Would you like to post a new patch to 
address [~arpitagarwal]'s comments?

> Add a metric to expose the timeout number of pending replication blocks
> ---
>
> Key: HDFS-10341
> URL: https://issues.apache.org/jira/browse/HDFS-10341
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-10341.01.patch, HDFS-10341.02.patch, 
> HDFS-10341.03.patch
>
>
> Per HDFS-6682, recording the number of timed-out pending replication blocks 
> is useful for assessing cluster health.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10465) libhdfs++: Implement GetBlockLocations

2016-05-31 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10465:
--
Assignee: Bob Hansen
  Status: Patch Available  (was: Open)

> libhdfs++: Implement GetBlockLocations
> --
>
> Key: HDFS-10465
> URL: https://issues.apache.org/jira/browse/HDFS-10465
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10465.HDFS-8707.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10465) libhdfs++: Implement GetBlockLocations

2016-05-31 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10465:
--
Attachment: HDFS-10465.HDFS-8707.000.patch

Introduces a new function in hdfs_ext: hdfsGetBlockLocations

> libhdfs++: Implement GetBlockLocations
> --
>
> Key: HDFS-10465
> URL: https://issues.apache.org/jira/browse/HDFS-10465
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
> Attachments: HDFS-10465.HDFS-8707.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10464) libhdfs++: Implement GetPathInfo and ListDirectory

2016-05-31 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10464:
--
Assignee: Bob Hansen

> libhdfs++: Implement GetPathInfo and ListDirectory
> --
>
> Key: HDFS-10464
> URL: https://issues.apache.org/jira/browse/HDFS-10464
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10464) libhdfs++: Implement GetPathInfo and ListDirectory

2016-05-31 Thread Anatoli Shein (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308786#comment-15308786
 ] 

Anatoli Shein commented on HDFS-10464:
--

I can fix this!

> libhdfs++: Implement GetPathInfo and ListDirectory
> --
>
> Key: HDFS-10464
> URL: https://issues.apache.org/jira/browse/HDFS-10464
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10433) Make retry also work well for Async DFS

2016-05-31 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10433:
-
Hadoop Flags: Reviewed

> Make retry also work well for Async DFS
> 
>
> Key: HDFS-10433
> URL: https://issues.apache.org/jira/browse/HDFS-10433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Xiaobing Zhou
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h10433_20160524.patch, h10433_20160525.patch, 
> h10433_20160525b.patch, h10433_20160527.patch, h10433_20160528.patch, 
> h10433_20160528c.patch
>
>
> In the current Async DFS implementation, file system calls are invoked and 
> return a Future to clients immediately. Clients call Future#get to retrieve 
> the final results. Future#get internally invokes a chain of callbacks 
> residing in ClientNamenodeProtocolTranslatorPB, ProtobufRpcEngine and 
> ipc.Client. This callback path bypasses the original retry layer/logic 
> designed for synchronous DFS. This proposes refactoring to make retry also 
> work for Async DFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10433) Make retry also work well for Async DFS

2016-05-31 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308630#comment-15308630
 ] 

Jing Zhao commented on HDFS-10433:
--

Thanks for updating the patch, [~szetszwo]! The latest patch looks good to me 
overall. Two questions and one minor point:
# The interval between two retries is realized by {{Thread.sleep}}, which makes 
the background thread in {{AsyncCallHandler}} sleep. Because all the 
client-side {{Future.get}} calls need to wait on the background thread for the 
final result, this sleep may delay all the pending requests.
# The current background thread sleeps inside of the loop, which may delay all 
the RPC requests. Ideally we want this thread to wait for a response 
notification from the RPC client, as sketched below.
# Minor: {{Counters}} can be created inside the Call's constructor instead of 
being passed as a parameter.

Looks like #1 and #2 need some extra work. Considering the current patch is 
already complicated, we can address them in a separate jira. I will give a +1 
for committing the current patch first.
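
To make #2 concrete, here is a rough, editor-added sketch of the non-blocking 
alternative (hypothetical names; a sketch, not the actual {{AsyncCallHandler}} 
code): instead of sleeping in the loop, a failed call is re-scheduled, so the 
single background thread keeps draining responses for the other pending calls.

{code:title=RetrySketch.java}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only: queue the retry instead of calling
// Thread.sleep() on the shared background thread, so one call's retry
// interval does not stall every other pending call.
public class RetrySketch {
  private static final ScheduledExecutorService SCHEDULER =
      Executors.newSingleThreadScheduledExecutor();

  static <T> CompletableFuture<T> retryAsync(
      Callable<T> call, int attempts, long delayMs) {
    CompletableFuture<T> result = new CompletableFuture<T>();
    attempt(call, attempts, delayMs, result);
    return result;
  }

  private static <T> void attempt(final Callable<T> call,
      final int attemptsLeft, final long delayMs,
      final CompletableFuture<T> result) {
    try {
      result.complete(call.call());        // success: complete the future
    } catch (Exception e) {
      if (attemptsLeft <= 1) {
        result.completeExceptionally(e);   // out of attempts: fail the future
      } else {
        // No blocking sleep: the retry is scheduled and the thread moves on.
        SCHEDULER.schedule(new Runnable() {
          @Override
          public void run() {
            attempt(call, attemptsLeft - 1, delayMs, result);
          }
        }, delayMs, TimeUnit.MILLISECONDS);
      }
    }
  }
}
{code}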


> Make retry also work well for Async DFS
> 
>
> Key: HDFS-10433
> URL: https://issues.apache.org/jira/browse/HDFS-10433
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Xiaobing Zhou
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h10433_20160524.patch, h10433_20160525.patch, 
> h10433_20160525b.patch, h10433_20160527.patch, h10433_20160528.patch, 
> h10433_20160528c.patch
>
>
> In the current Async DFS implementation, file system calls are invoked and 
> return a Future to clients immediately. Clients call Future#get to retrieve 
> the final results. Future#get internally invokes a chain of callbacks 
> residing in ClientNamenodeProtocolTranslatorPB, ProtobufRpcEngine and 
> ipc.Client. This callback path bypasses the original retry layer/logic 
> designed for synchronous DFS. This proposes refactoring to make retry also 
> work for Async DFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10466) DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation instead of BlockLocation

2016-05-31 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-10466:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Thanks [~andrew.wang] for the discussion.
I just need a unique ID for DN storage.
With https://issues.apache.org/jira/browse/HDFS-8887, BlockLocation already 
contains that information, so there is no need to add LocatedBlock.
Closing it.

> DistributedFileSystem.listLocatedStatus() should return HdfsBlockLocation 
> instead of BlockLocation
> --
>
> Key: HDFS-10466
> URL: https://issues.apache.org/jira/browse/HDFS-10466
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Juan Yu
>Assignee: Juan Yu
>Priority: Minor
> Attachments: HDFS-10466.001.patch, HDFS-10466.patch
>
>
> https://issues.apache.org/jira/browse/HDFS-202 added a new API 
> listLocatedStatus() to get all files' status with block locations for a 
> directory. This is great because we don't need to call 
> FileSystem.getFileBlockLocations() for each file; it's much faster (about 
> 8-10 times).
> However, the returned LocatedFileStatus only contains a basic BlockLocation 
> instead of an HdfsBlockLocation, so the LocatedBlock details are stripped 
> out.
> It should do the same as DFSClient.getBlockLocations() and return 
> HdfsBlockLocation, which provides the full block location details.
> The implementation of DistributedFileSystem.listLocatedStatus() retrieves 
> HdfsLocatedFileStatus, which contains all the information, but when 
> converting it to LocatedFileStatus, it doesn't keep the LocatedBlock data. 
> It's a simple (and compatible) change to keep the LocatedBlock details.
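
As a hedged illustration of the idea originally proposed here (and, per the 
comment above, not needed after HDFS-8887): the class names below mirror the 
HDFS client API, but the wrapping constructor used here is an assumption for 
illustration, not verified API.

{code:title=KeepLocatedBlocksSketch.java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.HdfsBlockLocation;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

// Hedged sketch: pair each plain BlockLocation with its LocatedBlock so
// the details survive the conversion to LocatedFileStatus.
public class KeepLocatedBlocksSketch {
  static BlockLocation[] withDetails(BlockLocation[] plain,
      List<LocatedBlock> located) throws IOException {
    HdfsBlockLocation[] rich = new HdfsBlockLocation[plain.length];
    for (int i = 0; i < plain.length; i++) {
      rich[i] = new HdfsBlockLocation(plain[i], located.get(i));
    }
    return rich;
  }
}
{code}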



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10341) Add a metric to expose the timeout number of pending replication blocks

2016-05-31 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308268#comment-15308268
 ] 

Arpit Agarwal commented on HDFS-10341:
--

Hi [~ajisakaa], that makes sense. So IIUC the metric may count the same block 
multiple times as timed out blocks are reinserted into the needed replications 
queue. If so perhaps we should rename the metric to 
{{NumTimedOutPendingReconstructions}} and update the documentation to state 
that it counts the number of timed out reconstructions and not the number of 
unique blocks that timed out?

+1 with those updates.
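
A toy, editor-added sketch of that counting semantics (hypothetical names, 
not the actual patch): the counter is bumped once per timeout occurrence, so 
a block that times out twice is counted twice.

{code:title=PendingReconstructionSketch.java}
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: NumTimedOutPendingReconstructions counts timeout
// occurrences, not unique blocks, because a timed-out block is re-queued
// and may time out (and be counted) again.
public class PendingReconstructionSketch {
  final Queue<String> neededReconstructions = new ArrayDeque<String>();
  final AtomicLong numTimedOutPendingReconstructions = new AtomicLong();

  void onPendingTimeout(String blockId) {
    neededReconstructions.add(blockId);                  // re-queue the block
    numTimedOutPendingReconstructions.incrementAndGet(); // count occurrence
  }

  public static void main(String[] args) {
    PendingReconstructionSketch s = new PendingReconstructionSketch();
    s.onPendingTimeout("blk_1");
    s.onPendingTimeout("blk_1"); // same block times out again
    System.out.println(s.numTimedOutPendingReconstructions.get()); // 2
  }
}
{code}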

> Add a metric to expose the timeout number of pending replication blocks
> ---
>
> Key: HDFS-10341
> URL: https://issues.apache.org/jira/browse/HDFS-10341
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-10341.01.patch, HDFS-10341.02.patch, 
> HDFS-10341.03.patch
>
>
> Per HDFS-6682, recording the number of timed-out pending replication blocks 
> is useful for assessing cluster health.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10211) Add more info to DelegationTokenIdentifier#toString for better supportability

2016-05-31 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10211:
-
Description: 
Base class {{AbstractDelegationTokenIdentifier}} has the following 
implementation of {{toString()}} method
{code}
@Override
  public String toString() {
StringBuilder buffer = new StringBuilder();
buffer
.append("owner=" + owner + ", renewer=" + renewer + ", realUser="
+ realUser + ", issueDate=" + issueDate + ", maxDate=" + maxDate
+ ", sequenceNumber=" + sequenceNumber + ", masterKeyId="
+ masterKeyId);
return buffer.toString();
  }
{code}

However, derived class 
{{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}

has the following implementation that overrides the base class above:

{code}
  @Override
  public String toString() {
return getKind() + " token " + getSequenceNumber()
+ " for " + getUser().getShortUserName();
  }
{code}

And when an exception is thrown because of token expiration or another reason 
(in {{AbstractDelegationTokenSecretManager#checkToken}}):
{code}
if (info.getRenewDate() < Time.now()) {
  throw new InvalidToken("token (" + identifier.toString() + ") is 
expired");
}
{code}
The exception doesn't show the detailed information about the token that the 
base class's toString() method returns.

Creating this jira to change the 
{{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}
 implementation to include all the info about the token, as included by the 
base class.

This change would help supportability, at the expense of printing a little more 
information to the log. I hope no code really depends on the output string. 





  was:
Base class {{AbstractDelegationTokenIdentifier}} has the following 
implementation of {{toString()}} method
{code}
@Override
  public String toString() {
StringBuilder buffer = new StringBuilder();
buffer
.append("owner=" + owner + ", renewer=" + renewer + ", realUser="
+ realUser + ", issueDate=" + issueDate + ", maxDate=" + maxDate
+ ", sequenceNumber=" + sequenceNumber + ", masterKeyId="
+ masterKeyId);
return buffer.toString();
  }
{code}

However, derived class 
{{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}

has the following implementation that overrides the base class above:

{code}
  @Override
  public String toString() {
return getKind() + " token " + getSequenceNumber()
+ " for " + getUser().getShortUserName();
  }
{code}

And when exception is thrown because of token expiration or other reason:
{code}
if (info.getRenewDate() < Time.now()) {
  throw new InvalidToken("token (" + identifier.toString() + ") is 
expired");
}
{code}
The exception doesn't show the detailed information about the token, like the 
base class' toString() method returns.

Creating this jira to change the 
{{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}
 implementation to include all the info about the token, as included by the 
base class.

This change would help supportability, at the expense of printing a little more 
information to the log. I hope no code really depends on the output string. 






> Add more info to DelegationTokenIdentifier#toString for better supportability
> -
>
> Key: HDFS-10211
> URL: https://issues.apache.org/jira/browse/HDFS-10211
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> Base class {{AbstractDelegationTokenIdentifier}} has the following 
> implementation of {{toString()}} method
> {code}
> @Override
>   public String toString() {
> StringBuilder buffer = new StringBuilder();
> buffer
> .append("owner=" + owner + ", renewer=" + renewer + ", realUser="
> + realUser + ", issueDate=" + issueDate + ", maxDate=" + maxDate
> + ", sequenceNumber=" + sequenceNumber + ", masterKeyId="
> + masterKeyId);
> return buffer.toString();
>   }
> {code}
> However, derived class 
> {{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}
> has the following implementation that overrides the base class above:
> {code}
>   @Override
>   public String toString() {
> return getKind() + " token " + getSequenceNumber()
> + " for " + getUser().getShortUserName();
>   }
> {code}
> And when an exception is thrown because of token expiration or another reason (in 
> {{AbstractDelegationTokenSecretManager#checkToken}}):
> {code}
> if (info.getRenewDate() < Time.now()) {
>   throw new InvalidToken("token (" + identifier.toString() + ") is 
> expired");
> }
> {code}
> The exception 

[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-05-31 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10441:
---
Attachment: HDFS-8707.HDFS-10441.001.patch

Rebased the patch on top of the SASL work.  I've been a little busier than 
expected, so I haven't had a chance to address all of Bob's comments yet.  
Posted mainly in case [~bobhansen] wants to check out the merge; otherwise the 
rest isn't worth looking at too closely yet.

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10370) Allow DataNode to be started with numactl

2016-05-31 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308009#comment-15308009
 ] 

Dave Marion commented on HDFS-10370:


bq. Secure mode daemons do not have the necessary code here

I made a small mention of this in the description; I'm not sure how it would 
work with jsvc.

Regarding some of the other points, I'm not familiar with the coding rules that 
are in place for the scripts. I don't believe I have the necessary karma to 
move this issue.

> Allow DataNode to be started with numactl
> -
>
> Key: HDFS-10370
> URL: https://issues.apache.org/jira/browse/HDFS-10370
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Dave Marion
>Assignee: Dave Marion
> Attachments: HDFS-10370-1.patch, HDFS-10370-2.patch, 
> HDFS-10370-3.patch, HDFS-10370-branch-2.004.patch, HDFS-10370.004.patch
>
>
> Allow numactl constraints to be applied to the datanode process. The 
> implementation I have in mind involves two environment variables (enable and 
> parameters) in the datanode startup process. Basically, if enabled and 
> numactl exists on the system, then start the java process using it. Provide a 
> default set of parameters, and allow the user to override the default. Wiring 
> this up for the non-jsvc use case seems straightforward. Not sure how this 
> can be supported using jsvc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8872) Reporting of missing blocks is different in fsck and namenode ui/metasave

2016-05-31 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307937#comment-15307937
 ] 

Rushabh S Shah commented on HDFS-8872:
--

[~mingma]: any thoughts ?


> Reporting of missing blocks is different in fsck and namenode ui/metasave
> -
>
> Key: HDFS-8872
> URL: https://issues.apache.org/jira/browse/HDFS-8872
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>
> The namenode UI and metasave will not report a block as missing if the only 
> replica is on a decommissioning/decommissioned node, while fsck will show it 
> as MISSING.
> Since a decommissioned node can be formatted/removed at any time, we can 
> actually lose the block.
> It's better to alert on the namenode UI if the only copy is on a 
> decommissioned/decommissioning node.
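
A toy, editor-added illustration of the discrepancy (simplified logic, not 
the actual NameNode code):

{code:title=MissingBlockSketch.java}
// Toy sketch: fsck ignores replicas on decommissioned/decommissioning
// nodes when deciding MISSING, while the UI/metasave path still counts
// them, so the two reports can disagree about the same block.
public class MissingBlockSketch {
  static boolean missingPerUi(int liveReplicas, int decomReplicas) {
    return liveReplicas + decomReplicas == 0; // decom copies still count
  }

  static boolean missingPerFsck(int liveReplicas, int decomReplicas) {
    return liveReplicas == 0;                 // decom copies don't count
  }

  public static void main(String[] args) {
    // The only replica lives on a decommissioned node:
    System.out.println(missingPerUi(0, 1));   // false -> UI stays quiet
    System.out.println(missingPerFsck(0, 1)); // true  -> fsck shows MISSING
  }
}
{code}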



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307905#comment-15307905
 ] 

Kihwal Lee edited comment on HDFS-10440 at 5/31/16 3:27 PM:


Thanks for the patch. I think it is showing the available information very 
well.  Having said that, we can take this opportunity to expose more on the 
block pool via jmx. The namenode addresses are useful, but showing the service 
actor state will be even better.  Sometimes datanodes have trouble talking to 
some namenodes, but not all. Verifying it usually involves looking at the log. 
Exposing individual BP service actor state through jmx and showing them through 
UI will be very helpful.

For the storage section, {{VolumeInfo}} in trunk/2.9/2.8 already contains 
{{reservedSpaceForReplicas}} (HDFS-6955) and {{numBlocks}} (HDFS-9425). Please 
verify (screenshot?) they appear on the web ui.


was (Author: kihwal):
Thanks for the patch. I think it is showing the available information very 
well.  Having said that, we can take this opportunity to expose more on the 
block pool via jmx. The namenode addressed are useful, but showing the service 
actor state will be even better.  Sometimes datanodes have trouble talking to 
some namenode, but not all. Verifying it usually involves looking at the log. 
Exposing individual BP service actor state through jmx and showing them through 
UI will be very helpful.

For the storage section, {{VolumeInfo}} in trunk/2.9/2.8 already contains 
{{reservedSpaceForReplicas}} (HDFS-6955) and {{numBlocks}} (HDFS-9425). Please 
verify (screenshot?) they appear on the web ui.

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, datanode web UI doesn't have much information except for node 
> name and port. Propose to add more information similar to namenode UI, 
> including, 
> * Static info (version, block pool  and cluster ID)
> * Running state (active, decommissioning, decommissioned or lost etc)
> * Summary (blocks, capacity, storage etc)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-9912) Make HDFS Federation docs up to date

2016-05-31 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-9912.
---
Resolution: Not A Bug

While dfsclusterhealth.jsp is removed, there is a patch to add it back: 
HDFS-8976.

The Federation doc does not mention how to coexist with HA, but the HA doc 
does mention the configuration needed to work with federation, so this jira 
is not needed.

> Make HDFS Federation docs up to date
> 
>
> Key: HDFS-9912
> URL: https://issues.apache.org/jira/browse/HDFS-9912
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>
> The _HDFS Federation_ documentation has a few places that are outdated:
> * dfsclusterhealth.jsp is already removed
> * It should mention how to configure Federation with high availability, 
> because the configuration appears incompatible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10440:
--
Assignee: Weiwei Yang  (was: WEIWEI YANG)

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, datanode web UI doesn't have much information except for node 
> name and port. Propose to add more information similar to namenode UI, 
> including, 
> * Static info (version, block pool  and cluster ID)
> * Running state (active, decommissioning, decommissioned or lost etc)
> * Summary (blocks, capacity, storage etc)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10440:
--
Assignee: WEIWEI YANG

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
>Assignee: WEIWEI YANG
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, datanode web UI doesn't have much information except for node 
> name and port. Propose to add more information similar to namenode UI, 
> including, 
> * Static info (version, block pool  and cluster ID)
> * Running state (active, decommissioning, decommissioned or lost etc)
> * Summary (blocks, capacity, storage etc)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307905#comment-15307905
 ] 

Kihwal Lee commented on HDFS-10440:
---

Thanks for the patch. I think it is showing the available information very 
well.  Having said that, we can take this opportunity to expose more on the 
block pool via jmx. The namenode addresses are useful, but showing the service 
actor state will be even better.  Sometimes datanodes have trouble talking to 
some namenodes, but not all. Verifying it usually involves looking at the log. 
Exposing individual BP service actor state through jmx and showing it through 
the UI will be very helpful, as sketched below.

For the storage section, {{VolumeInfo}} in trunk/2.9/2.8 already contains 
{{reservedSpaceForReplicas}} (HDFS-6955) and {{numBlocks}} (HDFS-9425). Please 
verify (screenshot?) they appear on the web ui.
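
A minimal, editor-added sketch of the JMX idea (hypothetical bean name and 
values, not the eventual patch):

{code:title=BPServiceActorInfoSketch.java}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: publish each BP service actor's state (which
// namenode it talks to and how that link is doing) via JMX so the web UI
// can render per-namenode health without log digging.
public class BPServiceActorInfoSketch implements BPServiceActorInfoMXBean {
  @Override
  public String getBPServiceActorInfo() {
    // A real datanode would build this from its live BPServiceActors.
    return "[{\"namenode\":\"nn1:8020\",\"actorState\":\"RUNNING\"},"
        + "{\"namenode\":\"nn2:8020\",\"actorState\":\"CONNECTING\"}]";
  }

  public static void main(String[] args) throws Exception {
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    mbs.registerMBean(new BPServiceActorInfoSketch(),
        new ObjectName("Sketch:name=BPServiceActorInfo"));
  }
}

interface BPServiceActorInfoMXBean {
  String getBPServiceActorInfo();
}
{code}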

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, datanode web UI doesn't have much information except for node 
> name and port. Propose to add more information similar to namenode UI, 
> including, 
> * Static info (version, block pool  and cluster ID)
> * Running state (active, decommissioning, decommissioned or lost etc)
> * Summary (blocks, capacity, storage etc)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10470) HDFS HA with Kerberos: Specified version of key is not available

2016-05-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10470:
--
Component/s: ha

> HDFS HA with Kerberos: Specified version of key is not available
> 
>
> Key: HDFS-10470
> URL: https://issues.apache.org/jira/browse/HDFS-10470
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: ha, journal-node
>Affects Versions: 2.6.0
> Environment: java version "1.7.0_79"
> Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
>Reporter: deng
>
> When I enable Kerberos with HDFS HA, the journalnode always throws the below 
> exception, but HDFS works well.
> 2016-05-30 10:54:37,877 WARN 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
> Authentication exception: GSSException: Failure unspecified at GSS-A
> PI level (Mechanism level: Specified version of key is not available (44))
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: Failure unspecified at GSS-API level (Mechanism level: 
> Specified version of key
> is not available (44))
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
> at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:517)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1279)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
> at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism 
> level: Specified version of key is not available (44))
> at 
> sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> sun.security.jgss.spnego.SpNegoContext.GSS_acceptSecContext(SpNegoContext.java:875)
> at 
> sun.security.jgss.spnego.SpNegoContext.acceptSecContext(SpNegoContext.java:548)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
>  at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:366)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:348)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:348)
> ... 

[jira] [Updated] (HDFS-9908) Datanode should tolerate disk scan failure during NN handshake

2016-05-31 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9908:
--
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

This patch no longer applies after the DU refactoring.

> Datanode should tolerate disk scan failure during NN handshake
> --
>
> Key: HDFS-9908
> URL: https://issues.apache.org/jira/browse/HDFS-9908
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.0
> Environment: CDH5.3.3
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9908.001.patch, HDFS-9908.002.patch, 
> HDFS-9908.003.patch, HDFS-9908.004.patch, HDFS-9908.005.patch, 
> HDFS-9908.006.patch, HDFS-9908.007.patch
>
>
> DN may treat a disk scan failure exception as an NN handshake exception, and 
> this can prevent a DN from joining a cluster even if most of its disks are 
> healthy.
> During the NN handshake, the DN initializes block pools. It will create a 
> lock file per disk, and then scan the volumes. However, if the scanning 
> throws exceptions due to disk failure, the DN will treat it as an exception 
> caused by the NN being inconsistent with the local storage (see 
> {{DataNode#initBlockPool}}). As a result, it will attempt to reconnect to 
> the NN again.
> However, at this point, the DN has not deleted its lock files on the disks. 
> If it reconnects to the NN again, it will think the same disks are already 
> being used, and then it will fail the handshake again because none of the 
> disks can be used (due to locking), and so on repeatedly. This will happen 
> even if the DN has multiple disks and only one of them fails. The DN will 
> not be able to connect to the NN despite just one failing disk. Note that 
> it is possible to successfully create a lock file on a disk and then hit an 
> error scanning the disk.
> We saw this on a CDH 5.3.3 cluster (which is based on Apache Hadoop 2.5.0, 
> and we still see the same bug in the 3.0.0 trunk branch). The root cause is 
> that the DN treats an internal error (a single disk failure) as an external 
> one (an NN handshake failure), and we should fix it.
> {code:title=DataNode.java}
> /**
>* One of the Block Pools has successfully connected to its NN.
>* This initializes the local storage for that block pool,
>* checks consistency of the NN's cluster ID, etc.
>* 
>* If this is the first block pool to register, this also initializes
>* the datanode-scoped storage.
>* 
>* @param bpos Block pool offer service
>* @throws IOException if the NN is inconsistent with the local storage.
>*/
>   void initBlockPool(BPOfferService bpos) throws IOException {
> NamespaceInfo nsInfo = bpos.getNamespaceInfo();
> if (nsInfo == null) {
>   throw new IOException("NamespaceInfo not found: Block pool " + bpos
>   + " should have retrieved namespace info before initBlockPool.");
> }
> 
> setClusterId(nsInfo.clusterID, nsInfo.getBlockPoolID());
> // Register the new block pool with the BP manager.
> blockPoolManager.addBlockPool(bpos);
> 
> // In the case that this is the first block pool to connect, initialize
> // the dataset, block scanners, etc.
> initStorage(nsInfo);
> // Exclude failed disks before initializing the block pools to avoid 
> startup
> // failures.
> checkDiskError();
> data.addBlockPool(nsInfo.getBlockPoolID(), conf);  <- this line 
> throws disk error exception
> blockScanner.enableBlockPoolId(bpos.getBlockPoolId());
> initDirectoryScanner(conf);
>   }
> {code}
> {{FsVolumeList#addBlockPool}} is the source of exception.
> {code:title=FsVolumeList.java}
>   void addBlockPool(final String bpid, final Configuration conf) throws 
> IOException {
> long totalStartTime = Time.monotonicNow();
> 
> final List<IOException> exceptions = Collections.synchronizedList(
> new ArrayList<IOException>());
> List<Thread> blockPoolAddingThreads = new ArrayList<Thread>();
> for (final FsVolumeImpl v : volumes) {
>   Thread t = new Thread() {
> public void run() {
>   try (FsVolumeReference ref = v.obtainReference()) {
> FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
> " on volume " + v + "...");
> long startTime = Time.monotonicNow();
> v.addBlockPool(bpid, conf);
> long timeTaken = Time.monotonicNow() - startTime;
> FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid +
> " on " + v + ": " + timeTaken + "ms");
>   } catch (ClosedChannelException e) {
> // ignore.
>   } catch (IOException ioe) {
> FsDatasetImpl.LOG.info("Caught exception while scanning " + v +
> ". Will throw later.", ioe);
> 

[jira] [Updated] (HDFS-10470) HDFS HA with Kerberos: Specified version of key is not available

2016-05-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10470:
--
Issue Type: Bug  (was: New Feature)

> HDFS HA with Kerberos: Specified version of key is not available
> 
>
> Key: HDFS-10470
> URL: https://issues.apache.org/jira/browse/HDFS-10470
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node
>Affects Versions: 2.6.0
> Environment: java version "1.7.0_79"
> Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
>Reporter: deng
>
> When I enable Kerberos with HDFS HA, the journalnode always throws the below 
> exception, but HDFS works well.
> 2016-05-30 10:54:37,877 WARN 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
> Authentication exception: GSSException: Failure unspecified at GSS-A
> PI level (Mechanism level: Specified version of key is not available (44))
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: Failure unspecified at GSS-API level (Mechanism level: 
> Specified version of key
> is not available (44))
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
> at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:517)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1279)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
> at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism 
> level: Specified version of key is not available (44))
> at 
> sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> sun.security.jgss.spnego.SpNegoContext.GSS_acceptSecContext(SpNegoContext.java:875)
> at 
> sun.security.jgss.spnego.SpNegoContext.acceptSecContext(SpNegoContext.java:548)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
>  at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:366)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:348)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:348)
>  

[jira] [Updated] (HDFS-10470) HDFS HA with Kerberos: Specified version of key is not available

2016-05-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10470:
--
Component/s: (was: datanode)
 journal-node

> HDFS HA with Kerberos: Specified version of key is not available
> 
>
> Key: HDFS-10470
> URL: https://issues.apache.org/jira/browse/HDFS-10470
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: ha, journal-node
>Affects Versions: 2.6.0
> Environment: java version "1.7.0_79"
> Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
>Reporter: deng
>
> When I enable Kerberos with HDFS HA, the journalnode always throws the below 
> exception, but HDFS works well.
> 2016-05-30 10:54:37,877 WARN 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
> Authentication exception: GSSException: Failure unspecified at GSS-A
> PI level (Mechanism level: Specified version of key is not available (44))
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: Failure unspecified at GSS-API level (Mechanism level: 
> Specified version of key
> is not available (44))
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
> at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:517)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1279)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
> at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism 
> level: Specified version of key is not available (44))
> at 
> sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> sun.security.jgss.spnego.SpNegoContext.GSS_acceptSecContext(SpNegoContext.java:875)
> at 
> sun.security.jgss.spnego.SpNegoContext.acceptSecContext(SpNegoContext.java:548)
> at 
> sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
>  at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:366)
> at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:348)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> 

[jira] [Commented] (HDFS-10471) DFSAdmin#SetQuotaCommand's help msg is not correct

2016-05-31 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307887#comment-15307887
 ] 

Rushabh S Shah commented on HDFS-10471:
---

+1 lgtm (non-binding)

> DFSAdmin#SetQuotaCommand's help msg is not correct
> --
>
> Key: HDFS-10471
> URL: https://issues.apache.org/jira/browse/HDFS-10471
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-10471.001.patch
>
>
> The help message of the command related to SetQuota is not shown correctly. 
> In the message, the name {{quota}} is shown as {{N}}, but {{N}} does not 
> appear anywhere before.
> {noformat}
> -setQuota  ...: Set the quota  for each 
> directory .
>   The directory quota is a long integer that puts a hard limit
>   on the number of names in the directory tree
>   For each directory, attempt to set the quota. An error will be 
> reported if
>   1. N is not a positive integer, or
>   2. User is not an administrator, or
>   3. The directory does not exist or is a file.
>   Note: A quota of 1 would force the directory to remain empty.
> {noformat}
> The command {{-setSpaceQuota}} also has a similar problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10367) TestDFSShell.testMoveWithTargetPortEmpty fails with Address bind exception.

2016-05-31 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-10367:

Attachment: HDFS-10367-005.patch

Uploaded the patch to address the above comment.

> TestDFSShell.testMoveWithTargetPortEmpty fails with Address bind exception.
> ---
>
> Key: HDFS-10367
> URL: https://issues.apache.org/jira/browse/HDFS-10367
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-10367-002.patch, HDFS-10367-003.patch, 
> HDFS-10367-004.patch, HDFS-10367-005.patch, HDFS-10367.patch
>
>
> {noformat}
> Problem binding to [localhost:9820] java.net.BindException: Address already 
> in use; For more details see:  http://wiki.apache.org/hadoop/BindException
> Stack Trace:
> java.net.BindException: Problem binding to [localhost:9820] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:444)
>   at sun.nio.ch.Net.bind(Net.java:436)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:530)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:793)
>   at org.apache.hadoop.ipc.Server.(Server.java:2592)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:563)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:538)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:426)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:783)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:924)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:903)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1620)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at 
> org.apache.hadoop.hdfs.TestDFSShell.testMoveWithTargetPortEmpty(TestDFSShell.java:567)
> {noformat}
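
As a hedged, editor-added aside (not the attached patch, which may solve this 
differently since the test needs the default port): a generic mitigation for 
such bind failures is to probe the fixed port and wait until it is free 
before starting the cluster.

{code:title=BindProbeSketch.java}
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch: poll until the fixed port can be bound, instead of
// letting the NameNode fail immediately with "Address already in use".
public class BindProbeSketch {
  static boolean isFree(int port) {
    try {
      ServerSocket s = new ServerSocket(port);
      s.close();
      return true;
    } catch (IOException e) {
      return false; // still in use; the caller may wait and retry
    }
  }

  static void waitForFreePort(int port, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!isFree(port)) {
      if (System.currentTimeMillis() > deadline) {
        throw new IllegalStateException("port " + port + " is still busy");
      }
      Thread.sleep(200);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    waitForFreePort(9820, 10000); // the default NameNode RPC port in trunk
    System.out.println("port 9820 is free");
  }
}
{code}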



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-31 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307519#comment-15307519
 ] 

Kai Zheng commented on HDFS-9833:
-

The latest patch LGTM and +1. Will commit it tomorrow.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to recompute the 
> block checksum on the fly for the missed/corrupt blocks. To do so, the block 
> data needs to be reconstructed by erasure decoding, and the code needed for 
> the block reconstruction can largely be borrowed from HDFS-9719, the 
> refactoring of the existing {{ErasureCodingWorker}}. In the EC worker, 
> reconstructed blocks need to be written out to target datanodes, but in this 
> case the remote write isn't necessary, as the reconstructed block data is 
> only used to recompute the checksum.
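
To make the idea concrete, here is a minimal, hypothetical sketch of the 
decode-then-checksum flow described above (an illustration, not the committed 
patch). The {{Decoder}} interface is a simplified stand-in for Hadoop's 
{{RawErasureDecoder}}, and a plain CRC32 stands in for the real per-block 
checksum; both substitutions are assumptions made for brevity.

{noformat}
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/**
 * Hypothetical sketch: recompute a block checksum from erasure-decoded data.
 * The reconstructed bytes live only in memory and are checksummed directly;
 * no write to a target datanode is needed.
 */
public class BlockChecksumRecomputeSketch {

  /** Simplified stand-in for Hadoop's RawErasureDecoder contract. */
  interface Decoder {
    void decode(ByteBuffer[] inputs, int[] erasedIndexes, ByteBuffer[] outputs);
  }

  static long recomputeChecksum(Decoder decoder, ByteBuffer[] survivors,
      int erasedIndex, int cellSize) {
    // Decode the single missing cell from the surviving data/parity cells.
    ByteBuffer[] outputs = { ByteBuffer.allocate(cellSize) };
    decoder.decode(survivors, new int[] { erasedIndex }, outputs);

    // Checksum the reconstructed bytes in memory, then let them be GC'd.
    outputs[0].flip();
    CRC32 crc = new CRC32();
    crc.update(outputs[0].array(), 0, outputs[0].limit());
    return crc.getValue();
  }
}
{noformat}

In the real datanode code the checksum would be the block's chunk-level CRC 
data rather than a single CRC32, but the shape of the flow is the same.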



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307427#comment-15307427
 ] 

Rakesh R commented on HDFS-9833:


Test case failures {{TestRollingUpgrade.testRollback}} and 
{{TestEditLog.testBatchedSyncWithClosedLogs}} are not related to my patch; 
please ignore them.

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, 
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, 
> HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to recompute the 
> block checksum on the fly for the missed/corrupt blocks. To do so, the block 
> data needs to be reconstructed by erasure decoding, and the code needed for 
> the block reconstruction can largely be borrowed from HDFS-9719, the 
> refactoring of the existing {{ErasureCodingWorker}}. In the EC worker, 
> reconstructed blocks need to be written out to target datanodes, but in this 
> case the remote write isn't necessary, as the reconstructed block data is 
> only used to recompute the checksum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data

2016-05-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307410#comment-15307410
 ] 

Hadoop QA commented on HDFS-9833:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 58s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s 
{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 56s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 104m 12s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807038/HDFS-9833-08.patch |
| JIRA Issue | HDFS-9833 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 69937bcccfc9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 93d8a7f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15612/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HDFS-Build/15612/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307405#comment-15307405
 ] 

Hadoop QA commented on HDFS-10440:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 0m 53s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807072/HDFS-10440.001.patch |
| JIRA Issue | HDFS-10440 |
| Optional Tests |  asflicense  |
| uname | Linux 6cce37dd588b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 93d8a7f |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/15614/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Attachment: HDFS-10440.001.patch

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Status: Patch Available  (was: Open)

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0, 2.6.0, 2.5.0
>Reporter: Weiwei Yang
> Attachments: HDFS-10440.001.patch, datanode_html.001.jpg, 
> datanode_utilities.001.jpg, dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307402#comment-15307402
 ] 

Weiwei Yang commented on HDFS-10440:


I have a patch ready that adds a datanode UI with basic information, including 
block pools and storage. Please check [#datanode_html.001.jpg] and 
[#datanode_utilities.001.jpg]. The patch can be applied to both trunk and 
branch-2. It is built on the existing datanode JMX data, which I think is a 
good place to start.
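
For reviewers who want to see the raw data behind the proposed pages, a quick 
probe of the datanode's {{/jmx}} servlet dumps the {{DataNodeInfo}} bean the 
UI can be rendered from. This is a hypothetical snippet, not part of the 
patch; the host and port (50075 being the classic default datanode HTTP port) 
are assumptions to adjust for your cluster.

{noformat}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical probe: dump the DataNodeInfo JMX bean as JSON. */
public class DataNodeJmxProbe {
  public static void main(String[] args) throws Exception {
    // Host/port are assumptions; 50075 is the classic DN HTTP default.
    URL url = new URL("http://localhost:50075/jmx"
        + "?qry=Hadoop:service=DataNode,name=DataNodeInfo");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // JSON with Version, ClusterId, VolumeInfo...
      }
    } finally {
      conn.disconnect();
    }
  }
}
{noformat}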

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: datanode_html.001.jpg, datanode_utilities.001.jpg, 
> dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Attachment: datanode_utilities.001.jpg
datanode_html.001.jpg

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: datanode_html.001.jpg, datanode_utilities.001.jpg, 
> dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10440) Improve DataNode web UI

2016-05-31 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-10440:
---
Attachment: (was: dn_UI_logs.jpg)

> Improve DataNode web UI
> ---
>
> Key: HDFS-10440
> URL: https://issues.apache.org/jira/browse/HDFS-10440
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Weiwei Yang
> Attachments: datanode_html.001.jpg, datanode_utilities.001.jpg, 
> dn_web_ui_mockup.jpg
>
>
> At present, the datanode web UI doesn't show much information besides the 
> node name and port. I propose adding more information, similar to the 
> namenode UI, including:
> * Static info (version, block pool and cluster ID)
> * Running state (active, decommissioning, decommissioned, lost, etc.)
> * Summary (blocks, capacity, storage, etc.)
> * Utilities (logs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307335#comment-15307335
 ] 

Vinayakumar B edited comment on HDFS-10472 at 5/31/16 7:24 AM:
---

Hi [~chenfolin], good find.
1. As there is no special logic handled for IOE, You can replace existing {{ 
catch (IOException ex) {}} itself with {{ catch (Throwable ex) {}}. 
Also note that, Hadoop uses 2 spaces as indentation, instead of 4.


was (Author: vinayrpet):
Hi [~chenfolin], good find.
1. As there is no special login handled for IOE, You can replace existing {{ 
catch (IOException ex) {}} itself with {{ catch (Throwable ex) {}}. 
Also note that, Hadoop uses 2 spaces as indentation, instead of 4.

> NameNode Rpc Reader Thread crash, and cluster hang.
> ---
>
> Key: HDFS-10472
> URL: https://issues.apache.org/jira/browse/HDFS-10472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.5.0, 2.6.0, 2.8.0, 2.7.2, 2.6.2, 2.6.4
>Reporter: ChenFolin
>  Labels: patch
> Attachments: HDFS-10472.patch
>
>
> My cluster hung yesterday because the RPC server Reader threads crashed, so 
> all RPC requests timed out, including datanode heartbeats. We can see that 
> the method doRunLoop only catches InterruptedException and IOException:
> while (running) {
>   SelectionKey key = null;
>   try {
> // consume as many connections as currently queued to avoid
> // unbridled acceptance of connections that starves the select
> int size = pendingConnections.size();
> for (int i=size; i>0; i--) {
>   Connection conn = pendingConnections.take();
>   conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
> }
> readSelector.select();
> Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
> while (iter.hasNext()) {
>   key = iter.next();
>   iter.remove();
>   if (key.isValid()) {
> if (key.isReadable()) {
>   doRead(key);
> }
>   }
>   key = null;
> }
>   } catch (InterruptedException e) {
> if (running) {  // unexpected -- log it
>   LOG.info(Thread.currentThread().getName() + " unexpectedly 
> interrupted", e);
> }
>   } catch (IOException ex) {
> LOG.error("Error in Reader", ex);
>   } 
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307335#comment-15307335
 ] 

Vinayakumar B edited comment on HDFS-10472 at 5/31/16 7:24 AM:
---

Hi [~chenfolin], good find.
1. As there is no special logic handled for IOE, You can replace existing 
{{catch (IOException ex) {}} itself with {{catch (Throwable ex) {}}. 
Also note that, Hadoop uses 2 spaces as indentation, instead of 4.


was (Author: vinayrpet):
Hi [~chenfolin], good find.
1. As there is no special logic handled for IOE, You can replace existing {{ 
catch (IOException ex) {}} itself with {{ catch (Throwable ex) {}}. 
Also note that, Hadoop uses 2 spaces as indentation, instead of 4.

> NameNode Rpc Reader Thread crash, and cluster hang.
> ---
>
> Key: HDFS-10472
> URL: https://issues.apache.org/jira/browse/HDFS-10472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.5.0, 2.6.0, 2.8.0, 2.7.2, 2.6.2, 2.6.4
>Reporter: ChenFolin
>  Labels: patch
> Attachments: HDFS-10472.patch
>
>
> My cluster hung yesterday because the RPC server Reader threads crashed, so 
> all RPC requests timed out, including datanode heartbeats. We can see that 
> the method doRunLoop only catches InterruptedException and IOException:
> while (running) {
>   SelectionKey key = null;
>   try {
> // consume as many connections as currently queued to avoid
> // unbridled acceptance of connections that starves the select
> int size = pendingConnections.size();
> for (int i=size; i>0; i--) {
>   Connection conn = pendingConnections.take();
>   conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
> }
> readSelector.select();
> Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
> while (iter.hasNext()) {
>   key = iter.next();
>   iter.remove();
>   if (key.isValid()) {
> if (key.isReadable()) {
>   doRead(key);
> }
>   }
>   key = null;
> }
>   } catch (InterruptedException e) {
> if (running) {  // unexpected -- log it
>   LOG.info(Thread.currentThread().getName() + " unexpectedly 
> interrupted", e);
> }
>   } catch (IOException ex) {
> LOG.error("Error in Reader", ex);
>   } 
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307335#comment-15307335
 ] 

Vinayakumar B commented on HDFS-10472:
--

Hi [~chenfolin], good find.
1. As there is no special login handled for IOE, You can replace existing {{ 
catch (IOException ex) {}} itself with {{ catch (Throwable ex) {}}. 
Also note that, Hadoop uses 2 spaces as indentation, instead of 4.

> NameNode Rpc Reader Thread crash, and cluster hang.
> ---
>
> Key: HDFS-10472
> URL: https://issues.apache.org/jira/browse/HDFS-10472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.5.0, 2.6.0, 2.8.0, 2.7.2, 2.6.2, 2.6.4
>Reporter: ChenFolin
>  Labels: patch
> Attachments: HDFS-10472.patch
>
>
> My cluster hung yesterday because the RPC server Reader threads crashed, so 
> all RPC requests timed out, including datanode heartbeats. We can see that 
> the method doRunLoop only catches InterruptedException and IOException:
> while (running) {
>   SelectionKey key = null;
>   try {
> // consume as many connections as currently queued to avoid
> // unbridled acceptance of connections that starves the select
> int size = pendingConnections.size();
> for (int i=size; i>0; i--) {
>   Connection conn = pendingConnections.take();
>   conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
> }
> readSelector.select();
> Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
> while (iter.hasNext()) {
>   key = iter.next();
>   iter.remove();
>   if (key.isValid()) {
> if (key.isReadable()) {
>   doRead(key);
> }
>   }
>   key = null;
> }
>   } catch (InterruptedException e) {
> if (running) {  // unexpected -- log it
>   LOG.info(Thread.currentThread().getName() + " unexpectedly 
> interrupted", e);
> }
>   } catch (IOException ex) {
> LOG.error("Error in Reader", ex);
>   } 
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread ChenFolin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenFolin updated HDFS-10472:
-
Attachment: HDFS-10472.patch

Added a patch that catches Throwable.

> NameNode Rpc Reader Thread crash, and cluster hang.
> ---
>
> Key: HDFS-10472
> URL: https://issues.apache.org/jira/browse/HDFS-10472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.5.0, 2.6.0, 2.8.0, 2.7.2, 2.6.2, 2.6.4
>Reporter: ChenFolin
>  Labels: patch
> Attachments: HDFS-10472.patch
>
>
> My cluster hung yesterday because the RPC server Reader threads crashed, so 
> all RPC requests timed out, including datanode heartbeats. We can see that 
> the method doRunLoop only catches InterruptedException and IOException:
> while (running) {
>   SelectionKey key = null;
>   try {
> // consume as many connections as currently queued to avoid
> // unbridled acceptance of connections that starves the select
> int size = pendingConnections.size();
> for (int i=size; i>0; i--) {
>   Connection conn = pendingConnections.take();
>   conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
> }
> readSelector.select();
> Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
> while (iter.hasNext()) {
>   key = iter.next();
>   iter.remove();
>   if (key.isValid()) {
> if (key.isReadable()) {
>   doRead(key);
> }
>   }
>   key = null;
> }
>   } catch (InterruptedException e) {
> if (running) {  // unexpected -- log it
>   LOG.info(Thread.currentThread().getName() + " unexpectedly 
> interrupted", e);
> }
>   } catch (IOException ex) {
> LOG.error("Error in Reader", ex);
>   } 
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread ChenFolin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenFolin updated HDFS-10472:
-
  Labels: patch  (was: )
Release Note: catch throwable
  Status: Patch Available  (was: Open)

Added a patch that catches Throwable.

> NameNode Rpc Reader Thread crash, and cluster hang.
> ---
>
> Key: HDFS-10472
> URL: https://issues.apache.org/jira/browse/HDFS-10472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.6.4, 2.6.2, 2.7.2, 2.6.0, 2.5.0, 2.8.0
>Reporter: ChenFolin
>  Labels: patch
>
> My cluster hung yesterday because the RPC server Reader threads crashed, so 
> all RPC requests timed out, including datanode heartbeats. We can see that 
> the method doRunLoop only catches InterruptedException and IOException:
> while (running) {
>   SelectionKey key = null;
>   try {
> // consume as many connections as currently queued to avoid
> // unbridled acceptance of connections that starves the select
> int size = pendingConnections.size();
> for (int i=size; i>0; i--) {
>   Connection conn = pendingConnections.take();
>   conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
> }
> readSelector.select();
> Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
> while (iter.hasNext()) {
>   key = iter.next();
>   iter.remove();
>   if (key.isValid()) {
> if (key.isReadable()) {
>   doRead(key);
> }
>   }
>   key = null;
> }
>   } catch (InterruptedException e) {
> if (running) {  // unexpected -- log it
>   LOG.info(Thread.currentThread().getName() + " unexpectedly 
> interrupted", e);
> }
>   } catch (IOException ex) {
> LOG.error("Error in Reader", ex);
>   } 
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10472) NameNode Rpc Reader Thread crash, and cluster hang.

2016-05-31 Thread ChenFolin (JIRA)
ChenFolin created HDFS-10472:


 Summary: NameNode Rpc Reader Thread crash, and cluster hang.
 Key: HDFS-10472
 URL: https://issues.apache.org/jira/browse/HDFS-10472
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, namenode
Affects Versions: 2.6.4, 2.6.2, 2.7.2, 2.6.0, 2.5.0, 2.8.0
Reporter: ChenFolin


My cluster hung yesterday because the RPC server Reader threads crashed, so 
all RPC requests timed out, including datanode heartbeats. We can see that the 
method doRunLoop only catches InterruptedException and IOException:

while (running) {
  SelectionKey key = null;
  try {
// consume as many connections as currently queued to avoid
// unbridled acceptance of connections that starves the select
int size = pendingConnections.size();
for (int i=size; i>0; i--) {
  Connection conn = pendingConnections.take();
  conn.channel.register(readSelector, SelectionKey.OP_READ, conn);
}
readSelector.select();

Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
while (iter.hasNext()) {
  key = iter.next();
  iter.remove();
  if (key.isValid()) {
if (key.isReadable()) {
  doRead(key);
}
  }
  key = null;
}
  } catch (InterruptedException e) {
if (running) {  // unexpected -- log it
  LOG.info(Thread.currentThread().getName() + " unexpectedly 
interrupted", e);
}
  } catch (IOException ex) {
LOG.error("Error in Reader", ex);
  } 
}
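
For completeness, here is a self-contained, hypothetical sketch of the 
hardening the attached patch proposes: catching {{Throwable}} instead of only 
{{IOException}}, so that an unexpected runtime error is logged and the Reader 
thread keeps serving rather than dying and hanging the cluster. The 
select/read body is stubbed out and the logging is simplified; only the catch 
structure is the point.

{noformat}
import java.io.IOException;

public class ReaderLoopSketch {
  private volatile boolean running = true;

  void doRunLoop() {
    while (running) {
      try {
        processOnePass(); // stands in for the select/dispatch body quoted above
      } catch (InterruptedException e) {
        if (running) { // unexpected -- log it, mirroring the original handler
          System.err.println(Thread.currentThread().getName()
              + " unexpectedly interrupted: " + e);
        }
      } catch (Throwable t) { // was: catch (IOException ex) -- the proposed change
        System.err.println("Error in Reader: " + t);
      }
    }
  }

  /** Placeholder for draining pendingConnections and readSelector.select(). */
  void processOnePass() throws InterruptedException, IOException {
  }

  void stop() {
    running = false;
  }
}
{noformat}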




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org