[jira] [Commented] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-27 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975896#comment-14975896
 ] 

Rakesh R commented on HDFS-9261:


Thanks [~umamaheswararao] for the advice. 
{{TestDFSStripedOutputStreamWithFailure#testMultipleDatanodeFailure56}} -> 
dnIndexSuite 6, 7, 8 covers failures of all the parity datanodes.

I've made minor updates and attached a new patch that includes a debug message. 
Please review it again!

> Erasure Coding: Skip encoding the data cells if all the parity data streamers 
> are failed for the current block group
> 
>
> Key: HDFS-9261
> URL: https://issues.apache.org/jira/browse/HDFS-9261
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9261-00.patch, HDFS-9261-01.patch
>
>
> {{DFSStripedOutputStream}} will continue writing with the minimum number 
> (dataBlockNum) of live datanodes. It won't replace the failed datanodes 
> immediately for the current block group. Consider the case where all the parity 
> data streamers have failed: it is then unnecessary to encode the data block 
> cells and generate the parity data. This is a corner case where the 
> {{writeParityCells()}} step can be skipped.
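
For illustration, a minimal, self-contained sketch of the corner-case check; the 
{{ParityStreamer}} type and its failed flag are simplified stand-ins, not the 
actual {{DFSStripedOutputStream}} internals:

{code}
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a parity streamer: it only tracks health.
class ParityStreamer {
  private final boolean failed;
  ParityStreamer(boolean failed) { this.failed = failed; }
  boolean isFailed() { return failed; }
}

public class SkipParityEncodingSketch {
  // Skip encoding when every parity streamer of the block group has failed.
  static boolean shouldSkipParityCells(List<ParityStreamer> parityStreamers) {
    return parityStreamers.stream().allMatch(ParityStreamer::isFailed);
  }

  public static void main(String[] args) {
    List<ParityStreamer> allFailed = Arrays.asList(
        new ParityStreamer(true), new ParityStreamer(true), new ParityStreamer(true));
    List<ParityStreamer> oneAlive = Arrays.asList(
        new ParityStreamer(true), new ParityStreamer(false), new ParityStreamer(true));
    System.out.println(shouldSkipParityCells(allFailed)); // true  -> skip writeParityCells()
    System.out.println(shouldSkipParityCells(oneAlive));  // false -> encode as usual
  }
}
{code}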



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8410) Add computation time metrics to datanode for ECWorker

2015-10-27 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8410:

Attachment: HDFS-8410-002.patch

> Add computation time metrics to datanode for ECWorker
> -
>
> Key: HDFS-8410
> URL: https://issues.apache.org/jira/browse/HDFS-8410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8410-001.patch, HDFS-8410-002.patch
>
>
> This is a sub-task of HDFS-7674. It adds a time metric for EC decode work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-27 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9261:
---
Attachment: HDFS-9261-01.patch

> Erasure Coding: Skip encoding the data cells if all the parity data streamers 
> are failed for the current block group
> 
>
> Key: HDFS-9261
> URL: https://issues.apache.org/jira/browse/HDFS-9261
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9261-00.patch, HDFS-9261-01.patch
>
>
> {{DFSStripedOutputStream}} will continue writing with the minimum number 
> (dataBlockNum) of live datanodes. It won't replace the failed datanodes 
> immediately for the current block group. Consider the case where all the parity 
> data streamers have failed: it is then unnecessary to encode the data block 
> cells and generate the parity data. This is a corner case where the 
> {{writeParityCells()}} step can be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8410) Add computation time metrics to datanode for ECWorker

2015-10-27 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8410:

Summary: Add computation time metrics to datanode for ECWorker  (was: Add 
time count metrics to datanode for ECWorker)

> Add computation time metrics to datanode for ECWorker
> -
>
> Key: HDFS-8410
> URL: https://issues.apache.org/jira/browse/HDFS-8410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8410-001.patch
>
>
> This is a sub-task of HDFS-7674. It adds a time metric for EC decode work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975911#comment-14975911
 ] 

Hadoop QA commented on HDFS-9261:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  1s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 27s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m  6s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | |  46m  4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768670/HDFS-9261-00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13216/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13216/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13216/console |


This message was automatically generated.

> Erasure Coding: Skip encoding the data cells if all the parity data streamers 
> are failed for the current block group
> 
>
> Key: HDFS-9261
> URL: https://issues.apache.org/jira/browse/HDFS-9261
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9261-00.patch, HDFS-9261-01.patch
>
>
> {{DFSStripedOutputStream}} will continue writing with the minimum number 
> (dataBlockNum) of live datanodes. It won't replace the failed datanodes 
> immediately for the current block group. Consider the case where all the parity 
> data streamers have failed: it is then unnecessary to encode the data block 
> cells and generate the parity data. This is a corner case where the 
> {{writeParityCells()}} step can be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8410) Add computation time metrics to datanode for ECWorker

2015-10-27 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976021#comment-14976021
 ] 

Li Bo commented on HDFS-8410:
-

Patch 002 reduces the number of metrics from 3 to 2. The time metrics let users 
know how much time each datanode spends on encoding/decoding work.
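
As a rough illustration of what such a computation-time metric measures, here is 
a minimal, self-contained timing sketch; the coding-work stand-in and the counter 
are hypothetical, not the patch's actual metrics code:

{code}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class EcComputationTimeSketch {
  // Accumulated nanoseconds spent in EC encode/decode work on this datanode.
  private static final AtomicLong codingTimeNanos = new AtomicLong();

  // Hypothetical stand-in for one ECWorker encode/decode step.
  static void doCodingWork() throws InterruptedException {
    TimeUnit.MILLISECONDS.sleep(5);
  }

  static void timedCodingWork() throws InterruptedException {
    long start = System.nanoTime();
    try {
      doCodingWork();
    } finally {
      codingTimeNanos.addAndGet(System.nanoTime() - start);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    for (int i = 0; i < 3; i++) {
      timedCodingWork();
    }
    System.out.println("EC coding time (ms): "
        + TimeUnit.NANOSECONDS.toMillis(codingTimeNanos.get()));
  }
}
{code}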

> Add computation time metrics to datanode for ECWorker
> -
>
> Key: HDFS-8410
> URL: https://issues.apache.org/jira/browse/HDFS-8410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8410-001.patch, HDFS-8410-002.patch
>
>
> This is a sub-task of HDFS-7674. It adds a time metric for EC decode work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975804#comment-14975804
 ] 

Hadoop QA commented on HDFS-9260:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 25s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 19 new or modified test files. |
| {color:green}+1{color} | javac |   8m  2s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 25s | The applied patch generated  9 
new checkstyle issues (total was 882, now 878). |
| {color:green}+1{color} | whitespace |   0m 36s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 36s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  53m 59s | Tests failed in hadoop-hdfs. |
| | | 103m 27s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
|   | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768859/HDFS-9260.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13214/console |


This message was automatically generated.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> We would like to hear people's feedback on this change, and also to get some 
> help investigating/understanding a few outstanding issues, if we are interested 
> in moving forward with this.
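
To see why sorted structures help full block report handling, here is a minimal 
sketch of a linear two-pointer diff over sorted block IDs; it illustrates the 
general idea only and is not the patch's actual data structures:

{code}
import java.util.ArrayList;
import java.util.List;

public class SortedFbrMergeSketch {
  // Given the stored replica IDs and the reported block IDs, both sorted,
  // a single linear merge finds added and removed replicas without hashing
  // or per-block allocation.
  static void diffSorted(long[] stored, long[] reported,
                         List<Long> added, List<Long> removed) {
    int i = 0, j = 0;
    while (i < stored.length && j < reported.length) {
      if (stored[i] == reported[j]) { i++; j++; }
      else if (stored[i] < reported[j]) removed.add(stored[i++]);
      else added.add(reported[j++]);
    }
    while (i < stored.length) removed.add(stored[i++]);
    while (j < reported.length) added.add(reported[j++]);
  }

  public static void main(String[] args) {
    List<Long> added = new ArrayList<>(), removed = new ArrayList<>();
    diffSorted(new long[]{1, 2, 5, 9}, new long[]{1, 3, 5, 9}, added, removed);
    System.out.println("added=" + added + " removed=" + removed); // added=[3] removed=[2]
  }
}
{code}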



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975815#comment-14975815
 ] 

Hadoop QA commented on HDFS-9313:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  25m  4s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  10m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  14m  7s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 31s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  3s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m  5s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 44s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 34s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 37s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  67m 32s | Tests failed in hadoop-hdfs. |
| | | 130m  8s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestRead |
|   | hadoop.hdfs.server.namenode.TestStorageRestore |
|   | hadoop.hdfs.server.namenode.TestParallelImageWrite |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.security.token.block.TestBlockToken |
|   | hadoop.hdfs.server.namenode.TestFSImage |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker |
|   | hadoop.hdfs.server.namenode.TestXAttrConfigFlag |
|   | hadoop.hdfs.security.TestDelegationToken |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768868/HDFS-9313.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13212/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13212/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13212/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13212/console |


This message was automatically generated.

> Possible NullPointerException in BlockManager if no excess replica can be 
> chosen
> 
>
> Key: HDFS-9313
> URL: https://issues.apache.org/jira/browse/HDFS-9313
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9313.patch
>
>
> HDFS-8647 makes it easier to reason about various block placement scenarios. 
> Here is one possible case where BlockManager won't be able to find the excess 
> replica to delete: when the storage policy changes around the same time the 
> balancer moves the block. When this happens, it will cause a NullPointerException.
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:156)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseReplicasToDelete(BlockPlacementPolicyDefault.java:978)
> {noformat}
> Note that it hasn't been found in any production clusters; instead, it was 
> found by new unit tests. In addition, the issue existed before HDFS-8647.
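
For illustration, a minimal sketch of the null-guard idea with hypothetical, 
simplified types (not the actual BlockPlacementPolicy code): when no excess 
replica can be chosen, stop instead of dereferencing the null choice:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExcessReplicaGuardSketch {
  // Hypothetical chooser: returns null when no candidate is left, which is
  // the situation behind the reported NullPointerException.
  static String chooseReplicaToDelete(List<String> candidates) {
    return candidates.isEmpty() ? null : candidates.get(0);
  }

  static List<String> pickExcess(List<String> candidates, int excessCount) {
    List<String> picked = new ArrayList<>();
    for (int k = 0; k < excessCount; k++) {
      String cur = chooseReplicaToDelete(candidates);
      if (cur == null) {
        break; // guard: stop instead of dereferencing a null choice
      }
      candidates.remove(cur);
      picked.add(cur);
    }
    return picked;
  }

  public static void main(String[] args) {
    List<String> candidates = new ArrayList<>(Arrays.asList("dn1:DISK"));
    // Asking for 2 excess replicas when only 1 candidate exists: no NPE.
    System.out.println(pickExcess(candidates, 2)); // [dn1:DISK]
  }
}
{code}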



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7896) HDFS Slow disk detection

2015-10-27 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975816#comment-14975816
 ] 

Brahma Reddy Battula commented on HDFS-7896:


A slow disk checker script should be well designed. 

Firstly, it should take storage types into consideration. And should we measure 
read/write throughput, or IOPS? 

Cache/free memory affects the results. 
What if the disk is currently under heavy load? 
What's more, a script may not run in every environment. 

Anyway, providing a default script is not a good idea. Re-inventing the wheel is 
also not a good idea.

There exist some benchmark tools: 
http://askubuntu.com/questions/87035/how-to-check-hard-disk-performance

These tools give you a result number. But what's the threshold for a "slow" disk?

What I'm thinking is: we don't write the script, and the script is not used for 
running a benchmark. Instead, we get the result via a script from somewhere; the 
result must be prepared by some other daemon in advance. 

Running a benchmark at startup could take a long time, even though we can print 
interactive feedback. I would prefer to detect slow disks periodically. Some 
other daemon can periodically refresh the benchmark results and feed HDFS the 
results. The daemon can run a benchmark on a disk when the disk is lightly 
loaded. Other information, like bad sector counts, can be checked more often.


What do you think?
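
A minimal sketch of that periodic-probe idea, assuming made-up paths, a made-up 
threshold, and a made-up period (this is not an actual HDFS daemon):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SlowDiskProbeSketch {
  // Made-up threshold: flag a disk if a small write takes longer than this.
  static final long SLOW_WRITE_MILLIS = 200;

  static long timeSmallWrite(Path dir) throws IOException {
    Path probe = dir.resolve(".disk-probe");
    long start = System.nanoTime();
    Files.write(probe, new byte[64 * 1024]);   // 64 KB probe write
    long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    Files.deleteIfExists(probe);
    return elapsed;
  }

  public static void main(String[] args) {
    Path[] dataDirs = { Paths.get("/tmp") };   // stand-in for DN data dirs
    ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
    // The daemon keeps running and re-probes each disk on a fixed schedule.
    ses.scheduleAtFixedRate(() -> {
      for (Path dir : dataDirs) {
        try {
          long ms = timeSmallWrite(dir);
          if (ms > SLOW_WRITE_MILLIS) {
            System.out.println(dir + " looks slow: " + ms + " ms");
          }
        } catch (IOException e) {
          System.out.println(dir + " probe failed: " + e);
        }
      }
    }, 0, 10, TimeUnit.MINUTES);               // made-up 10-minute period
  }
}
{code}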

> HDFS Slow disk detection
> 
>
> Key: HDFS-7896
> URL: https://issues.apache.org/jira/browse/HDFS-7896
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Arpit Agarwal
> Attachments: HDFS-7896.00.patch
>
>
> HDFS should detect slow disks. To start with we can flag this information via 
> the NameNode web UI. Alternatively DNs can avoid using slow disks for writes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9305) Delayed heartbeat processing causes storm of subsequent heartbeats

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975860#comment-14975860
 ] 

Hudson commented on HDFS-9305:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/])
HDFS-9305. Delayed heartbeat processing causes storm of subsequent (arp: rev 
d8736eb9ca351b82854601ea3b1fbc3c9fab44e4)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBpServiceActorScheduler.java


> Delayed heartbeat processing causes storm of subsequent heartbeats
> --
>
> Key: HDFS-9305
> URL: https://issues.apache.org/jira/browse/HDFS-9305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Arpit Agarwal
> Attachments: HDFS-9305.01.patch, HDFS-9305.02.patch
>
>
> A DataNode typically sends a heartbeat to the NameNode every 3 seconds.  We 
> expect heartbeat handling to complete relatively quickly.  However, if 
> something unexpected causes heartbeat processing to get blocked, such as a 
> long GC or heavy lock contention within the NameNode, then heartbeat 
> processing would be delayed.  After recovering from this delay, the DataNode 
> then starts sending a storm of heartbeat messages in a tight loop.  In a 
> large cluster with many DataNodes, this storm of heartbeat messages could 
> cause harmful load on the NameNode and make overall cluster recovery more 
> difficult.
> The bug appears to be caused by incorrect timekeeping inside 
> {{BPServiceActor}}.  The next heartbeat time is always calculated as a delta 
> from the previous heartbeat time, without any compensation for possible long 
> latency on an individual heartbeat RPC.  The only mitigations would be 
> restarting all DataNodes to force a reset of the heartbeat schedule, or 
> simply waiting out the storm until the scheduling catches up and corrects itself.
> This problem would not manifest after a NameNode restart.  In that case, the 
> NameNode would respond to the first heartbeat by telling the DataNode to 
> re-register, and {{BPServiceActor#reRegister}} would reset the heartbeat 
> schedule to the current time.  I believe the problem would only manifest if 
> the NameNode process stayed alive but processed heartbeats unexpectedly slowly.
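
For illustration, a minimal sketch contrasting the two scheduling rules; the 
numbers are illustrative and the methods are hypothetical simplifications of the 
{{BPServiceActor}} logic:

{code}
public class HeartbeatScheduleSketch {
  static final long INTERVAL_MS = 3000;

  // Buggy rule: next target is always previous target + interval, so after a
  // long stall the actor fires immediately, over and over, until it catches up.
  static long nextFromPreviousTarget(long previousTargetMs) {
    return previousTargetMs + INTERVAL_MS;
  }

  // Fixed rule: reset the schedule from the current (monotonic) time, so a
  // stall costs one late heartbeat instead of a storm.
  static long nextFromNow(long nowMs) {
    return nowMs + INTERVAL_MS;
  }

  public static void main(String[] args) {
    long target = 0;
    long now = 60_000; // heartbeat handling was blocked for a minute
    long buggy = nextFromPreviousTarget(target);
    long fixed = nextFromNow(now);
    System.out.println("buggy next target: " + buggy
        + " (already " + (now - buggy) + " ms overdue -> tight loop)");
    System.out.println("fixed next target: " + fixed);
  }
}
{code}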



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9291) Fix TestInterDatanodeProtocol to be FsDataset-agnostic.

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975855#comment-14975855
 ] 

Hudson commented on HDFS-9291:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/])
HDFS-9291. Fix TestInterDatanodeProtocol to be FsDataset-agnostic. (lei) (lei: 
rev 37bf6141f10d6f4be138c965ea08032420b01f56)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix TestInterDatanodeProtocol to be FsDataset-agnostic.
> ---
>
> Key: HDFS-9291
> URL: https://issues.apache.org/jira/browse/HDFS-9291
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9291.00.patch
>
>
> {{TestInterDatanodeProtocol}} assumes the fsdataset is {{FsDatasetImpl}}. 
> This JIRA will make it dataset-agnostic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8945) Update the description about replica placement in HDFS Architecture documentation

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975861#comment-14975861
 ] 

Hudson commented on HDFS-8945:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/])
HDFS-8945. Update the description about replica placement in HDFS (wang: rev 
e8aefdf08bc79a0ad537c1b7a1dc288aabd399b9)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Update the description about replica placement in HDFS Architecture 
> documentation
> -
>
> Key: HDFS-8945
> URL: https://issues.apache.org/jira/browse/HDFS-8945
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8945.001.patch, HDFS-8945.002.patch
>
>
> The description of replica placement should be updated to cover:
> * storage types and storage policies
> * the placement policy for replication factors greater than 4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9292) Make TestFileConcorruption independent to underlying FsDataset Implementation.

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975854#comment-14975854
 ] 

Hudson commented on HDFS-9292:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/])
HDFS-9292. Make TestFileConcorruption independent to underlying (lei: rev 
399ad009158cbc6aca179396d390fe770801420f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCorruption.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make TestFileConcorruption independent to underlying FsDataset Implementation.
> --
>
> Key: HDFS-9292
> URL: https://issues.apache.org/jira/browse/HDFS-9292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9292.00.patch
>
>
> {{TestFileCorruption}} manipulates the block data by directly accessing the 
> block files on disk.  {{MiniDFSCluster}} already offers ways to corrupt data. 
> We can use that to make {{TestFileCorruption}} agnostic to the 
> implementation. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9284) fsck command should not print exception trace when file not found

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975863#comment-14975863
 ] 

Hudson commented on HDFS-9284:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/])
HDFS-9284. fsck command should not print exception trace when file not (wang: 
rev 677a936bf759515ac94d9accb9bf5364f688d051)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java


> fsck command should not print exception trace when file not found 
> --
>
> Key: HDFS-9284
> URL: https://issues.apache.org/jira/browse/HDFS-9284
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Fix For: 2.8.0
>
> Attachments: HDFS-9284_00.patch, HDFS-9284_01.patch, 
> HDFS-9284_02.patch
>
>
> when the file doesn't exist, fsck throws an exception 
> {code}
> ./hdfs fsck /kiran
> {code}
> the following exception occurs 
> {code}
> WARN util.NativeCodeLoader: Unable to load native-hadoop library for your 
> platform... using builtin-java classes where applicable
> FileSystem is inaccessible due to:
> java.io.FileNotFoundException: File does not exist: /kiran
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1273)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1265)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1265)
> at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:755)
> at org.apache.hadoop.hdfs.tools.DFSck.getResolvedPath(DFSck.java:236)
> at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:316)
> at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:73)
> at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:155)
> at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:152)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
> at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:151)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:383)
> {code}
> but only the {code}File does not exist: /kiran{code} error message should be 
> printed. The current code is:
> {code} } catch (IOException ioe) {
> System.err.println("FileSystem is inaccessible due to:\n"
> + StringUtils.stringifyException(ioe));
> }{code}
> I think it should use the ioe.getMessage() method instead:
> {code}
> } catch (IOException ioe) {
> System.err.println("FileSystem is inaccessible due to:\n"
> + ioe.getMessage());
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8256) "-storagepolicies , -blockId ,-replicaDetails " options are missed out in usage and from documentation

2015-10-27 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975925#comment-14975925
 ] 

Akira AJISAKA commented on HDFS-8256:
-

The fsck -blockId (HDFS-6663) and -storagepolicies (HDFS-7467) options are in 
2.7.0, so we need to document them in branch-2.7.
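
For reference, the two options are invoked along these lines (the path and block 
ID below are illustrative):

{noformat}
hdfs fsck / -storagepolicies
hdfs fsck -blockId blk_1073741825
{noformat}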

> "-storagepolicies , -blockId ,-replicaDetails " options are missed out in 
> usage and from documentation
> --
>
> Key: HDFS-8256
> URL: https://issues.apache.org/jira/browse/HDFS-8256
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: J.Andreina
>Assignee: J.Andreina
> Fix For: 2.8.0
>
> Attachments: HDFS-8256.2.patch, HDFS-8256.3.patch, 
> HDFS-8256.4-branch-2.patch, HDFS-8256.4.patch, HDFS-8256_Trunk.1.patch
>
>
> "-storagepolicies , -blockId ,-replicaDetails " options are missed out in 
> usage and from documentation.
> {noformat}
> Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks]]]] [-includeSnapshots] 
> [-showprogress]
> {noformat}
> Found as part of HDFS-8108.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975837#comment-14975837
 ] 

Walter Su commented on HDFS-9289:
-

The patch hides a potentially bigger bug. We should find it and address it.
Hi, [~lichangleo]. I'd really appreciate it if you could enable the debug level of 
{{NameNode.blockStateChangeLog}} and attach more logs, or give instructions on 
how to reproduce it.

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> we have seen a case of a corrupt block caused by a file complete after a 
> pipelineUpdate, where the file completed with the old block genStamp. This 
> caused the replicas of two datanodes in the updated pipeline to be viewed as 
> corrupt. Propose to check the genStamp when committing the block.
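
For illustration, a minimal sketch of the proposed check with simplified 
stand-in types (not the actual BlockManager commit path):

{code}
import java.io.IOException;

public class CommitBlockGenStampSketch {
  // Simplified stand-in for a block: id plus generation stamp.
  static final class Block {
    final long id, genStamp;
    Block(long id, long genStamp) { this.id = id; this.genStamp = genStamp; }
  }

  // The idea behind the patch: refuse to commit when the client reports a
  // stale generation stamp (e.g. from before a pipeline update).
  static void commitBlock(Block stored, Block reported) throws IOException {
    if (stored.id != reported.id) {
      throw new IOException("block id mismatch");
    }
    if (stored.genStamp != reported.genStamp) {
      throw new IOException("commit with stale genStamp " + reported.genStamp
          + ", expected " + stored.genStamp);
    }
    System.out.println("committed blk_" + stored.id + "_" + stored.genStamp);
  }

  public static void main(String[] args) throws IOException {
    Block stored = new Block(42, 1002);          // genStamp bumped by pipeline update
    commitBlock(stored, new Block(42, 1002));    // ok
    try {
      commitBlock(stored, new Block(42, 1001));  // stale genStamp -> rejected
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
{code}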



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9168) Move client side unit test to hadoop-hdfs-client

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975843#comment-14975843
 ] 

Hadoop QA commented on HDFS-9168:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 16s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 13 new or modified test files. |
| {color:green}+1{color} | javac |   8m 22s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m  3s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 59s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 13s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 30s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  51m  6s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 57s | Tests passed in 
hadoop-hdfs-client. |
| | | 106m 18s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestFsck |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestEditLog |
|   | org.apache.hadoop.hdfs.server.namenode.TestFileTruncate |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768855/HDFS-9168.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13215/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13215/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13215/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13215/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13215/console |


This message was automatically generated.

> Move client side unit test to hadoop-hdfs-client
> 
>
> Key: HDFS-9168
> URL: https://issues.apache.org/jira/browse/HDFS-9168
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9168.000.patch, HDFS-9168.001.patch, 
> HDFS-9168.002.patch, HDFS-9168.003.patch
>
>
> We need to identify the client-side unit tests of HDFS and move them to the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976025#comment-14976025
 ] 

Hadoop QA commented on HDFS-9276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 46s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 16s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 55s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 16s | The applied patch generated  3 
new checkstyle issues (total was 28, now 31). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m  0s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   7m 29s | Tests passed in 
hadoop-common. |
| | |  50m 13s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/1276/HDFS-9276.04.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13218/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13218/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13218/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13218/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13218/console |


This message was automatically generated.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, debug1.PNG, debug2.PNG
>
>
> The scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. An HDFS Delegation Token (not a keytab or TGT) is used to communicate with 
> the NameNode.
> 4. We want to update the HDFS Delegation Token for long-running applications. 
> The HDFS client generates private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens are not updated, which 
> causes the token to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, 

[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-27 Thread Liangliang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangliang Gu updated HDFS-9276:

Attachment: HDFS-9276.05.patch

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, debug1.PNG, 
> debug2.PNG
>
>
> The scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. An HDFS Delegation Token (not a keytab or TGT) is used to communicate with 
> the NameNode.
> 4. We want to update the HDFS Delegation Token for long-running applications. 
> The HDFS client generates private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens are not updated, which 
> causes the token to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration to Name Node:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occur after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>   at 
> 

[jira] [Commented] (HDFS-9315) Update excess storage type list properly when del hint is picked

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976138#comment-14976138
 ] 

Hadoop QA commented on HDFS-9315:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  31m 26s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  11m 15s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  14m  8s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 34s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 58s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m  4s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 49s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  64m  3s | Tests failed in hadoop-hdfs. |
| | | 134m  6s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferKeepalive |
|   | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestDFSClientRetries |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768879/HDFS-9315.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13217/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13217/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13217/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13217/console |


This message was automatically generated.

> Update excess storage type list properly when del hint is picked
> 
>
> Key: HDFS-9315
> URL: https://issues.apache.org/jira/browse/HDFS-9315
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9315.patch
>
>
> HDFS-8647 makes it easier to reason about various block placement scenarios. 
> Here is one potential issue where {{excessTypes}} isn't updated when 
> {{delNodeHint}} is picked. When {{delNodeHint}} isn't picked, the excess 
> storage identified will be removed from excessTypes.
> {noformat}
>   if (useDelHint(firstOne, delNodeHintStorage, addedNodeStorage,
>   moreThanOne, tmpExcessTypes)) {
> cur = delNodeHintStorage;
>   } else { // regular excessive replica removal
> cur = chooseReplicaToDelete((short) expectedNumOfReplicas, 
> moreThanOne,
> exactlyOne, tmpExcessTypes);
>   }
> chooseReplicaToDelete(...) {
> ...
> excessTypes.remove(storage.getStorageType());
> }
> {noformat}
> It isn't clear how this can happen in the real world; maybe HDFS-9314. Usually, 
> when the del hint is used, the delta between the expected number of replicas 
> and the actual number is one, and thus this shouldn't cause any issue.
> Still, it is better to make it consistent: each time an excess replica is 
> picked, excessTypes should be updated, regardless of whether it comes from the 
> del hint or not.
> Note this issue was there prior to HDFS-8647.
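
A minimal sketch of that consistency rule with simplified, hypothetical types 
(not the actual BlockPlacementPolicyDefault code):

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExcessTypesUpdateSketch {
  enum StorageType { DISK, SSD, ARCHIVE }

  static final class Storage {
    final String name; final StorageType type;
    Storage(String name, StorageType type) { this.name = name; this.type = type; }
  }

  // The rule the issue asks for: whichever way the excess replica is chosen
  // (del hint or regular selection), remove its storage type from excessTypes.
  static Storage pickExcess(Storage delHint, boolean useDelHint,
                            List<Storage> candidates,
                            List<StorageType> excessTypes) {
    Storage cur = useDelHint ? delHint : candidates.get(0);
    excessTypes.remove(cur.type); // updated in both branches
    return cur;
  }

  public static void main(String[] args) {
    List<StorageType> excess = new ArrayList<>(Arrays.asList(StorageType.DISK));
    Storage hint = new Storage("dn1", StorageType.DISK);
    List<Storage> others = Arrays.asList(new Storage("dn2", StorageType.DISK));
    pickExcess(hint, true, others, excess);
    System.out.println("excessTypes after pick: " + excess); // []
  }
}
{code}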



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HDFS-9317:
---

 Summary: Document fsck -blockId and -storagepolicy options in 
branch-2.7
 Key: HDFS-9317
 URL: https://issues.apache.org/jira/browse/HDFS-9317
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.7.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA


The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
HDFS-7467, but the options are not documented in the 2.7.1 release.
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8108) Fsck should provide the info on mandatory option to be used along with "-blocks , -locations and -racks"

2015-10-27 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-8108:

Component/s: documentation

> Fsck should provide the info on mandatory option to be used along with 
> "-blocks , -locations and -racks"
> 
>
> Key: HDFS-8108
> URL: https://issues.apache.org/jira/browse/HDFS-8108
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-8108.1.patch, HDFS-8108.2.patch
>
>
> The fsck usage information should state which options are mandatory to be 
> passed along with "-blocks, -locations and -racks", to be in sync with the 
> documentation.
> For example:
> To get information on:
> 1. Blocks (-blocks), the option "-files" should also be used.
> 2. Rack information (-racks), the options "-files" and "-blocks" should also 
> be used.
> {noformat}
> ./hdfs fsck -files -blocks
> ./hdfs fsck -files -blocks -racks
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-9317:

Attachment: HDFS-9317-branch-2.7.00.patch

Attaching a patch for branch-2.7.

> Document fsck -blockId and -storagepolicy options in branch-2.7
> ---
>
> Key: HDFS-9317
> URL: https://issues.apache.org/jira/browse/HDFS-9317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-9317-branch-2.7.00.patch
>
>
> The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
> HDFS-7467, but the options are not documented in the 2.7.1 release.
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-9317:

Status: Patch Available  (was: Open)

> Document fsck -blockId and -storagepolicy options in branch-2.7
> ---
>
> Key: HDFS-9317
> URL: https://issues.apache.org/jira/browse/HDFS-9317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-9317-branch-2.7.00.patch
>
>
> The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
> HDFS-7467, but the options are not documented in the 2.7.1 release.
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8529) Add blocks count metrics to datanode for ECWorker

2015-10-27 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8529:

Attachment: HDFS-8529-002.patch

> Add blocks count metrics to datanode for ECWorker
> -
>
> Key: HDFS-8529
> URL: https://issues.apache.org/jira/browse/HDFS-8529
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8529-001.patch, HDFS-8529-002.patch
>
>
> This sub-task will add block count metrics to the datanode that performs 
> encoding and recovery tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9316) BytesRead counter not incremented in WebhdfsFileSystem

2015-10-27 Thread Shradha Revankar (JIRA)
Shradha Revankar created HDFS-9316:
--

 Summary: BytesRead counter not incremented in WebhdfsFileSystem
 Key: HDFS-9316
 URL: https://issues.apache.org/jira/browse/HDFS-9316
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Shradha Revankar
Assignee: Shradha Revankar
Priority: Minor


When WebhdfsFileSystem is used, the bytesRead counter in FileSystem.statistics 
returns 0.

{{statistics.incrementBytesRead();}}

has to be called in the read() methods.
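
For illustration, a minimal, self-contained sketch of a read() wrapper that 
bumps a bytes-read counter in its read() methods; the counter is a stand-in for 
FileSystem.Statistics, not the actual WebhdfsFileSystem code:

{code}
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

public class BytesReadCounterSketch {
  // Stand-in for FileSystem.Statistics#incrementBytesRead.
  static final AtomicLong bytesRead = new AtomicLong();

  static final class CountingStream extends FilterInputStream {
    CountingStream(InputStream in) { super(in); }

    @Override public int read() throws IOException {
      int b = super.read();
      if (b != -1) bytesRead.incrementAndGet();
      return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
      int n = super.read(buf, off, len);
      if (n > 0) bytesRead.addAndGet(n);  // the call the issue says is missing
      return n;
    }
  }

  public static void main(String[] args) throws IOException {
    try (InputStream in = new CountingStream(
        new ByteArrayInputStream(new byte[1234]))) {
      byte[] buf = new byte[512];
      while (in.read(buf) != -1) { /* drain */ }
    }
    System.out.println("bytesRead = " + bytesRead.get()); // 1234
  }
}
{code}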





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976206#comment-14976206
 ] 

Akira AJISAKA commented on HDFS-9317:
-

These options are documented in trunk and branch-2 by HDFS-8256; however, we 
cannot simply backport the patch because HDFS-8256 includes the fsck 
-replicaDetails option, which is not implemented in branch-2.7.

> Document fsck -blockId and -storagepolicy options in branch-2.7
> ---
>
> Key: HDFS-9317
> URL: https://issues.apache.org/jira/browse/HDFS-9317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>
> The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
> HDFS-7467, but the options are not documented in the 2.7.1 release.
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976115#comment-14976115
 ] 

Hadoop QA commented on HDFS-9261:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  25m 24s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |  12m 25s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  16m 30s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 35s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 24s | The applied patch generated  1 
new checkstyle issues (total was 8, now 9). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 12s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 53s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   5m 16s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 48s | Tests passed in 
hadoop-hdfs-client. |
| | |  69m 59s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768919/HDFS-9261-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13220/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13220/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13220/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13220/console |


This message was automatically generated.

> Erasure Coding: Skip encoding the data cells if all the parity data streamers 
> are failed for the current block group
> 
>
> Key: HDFS-9261
> URL: https://issues.apache.org/jira/browse/HDFS-9261
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9261-00.patch, HDFS-9261-01.patch
>
>
> {{DFSStripedOutputStream}} will continue writing with minimum number 
> (dataBlockNum) of live datanodes. It won't replace the failed datanodes 
> immediately for the current block group. Consider a case where all the parity 
> data streamers are failed, now it is unnecessary to encode the data block 
> cells and generate the parity data. This is a corner case where it can skip 
> {{writeParityCells()}} step.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976176#comment-14976176
 ] 

Hadoop QA commented on HDFS-9129:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  30m 48s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |  10m 22s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  13m 30s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 30s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 51s | The applied patch generated  7 
new checkstyle issues (total was 804, now 755). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 58s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 46s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 20s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  79m 15s | Tests failed in hadoop-hdfs. |
| | | 146m 41s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestEditLogRace |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768896/HDFS-9129.011.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13219/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13219/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13219/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13219/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13219/console |


This message was automatically generated.

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch, HDFS-9129.006.patch, HDFS-9129.007.patch, 
> HDFS-9129.008.patch, HDFS-9129.009.patch, HDFS-9129.010.patch, 
> HDFS-9129.011.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of safe mode. These fields can be moved to the 
> {{BlockManager}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9309) Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig()

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975787#comment-14975787
 ] 

Hadoop QA commented on HDFS-9309:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |   9m 58s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 59s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 42s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m  2s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   1m 37s | Tests passed in 
hadoop-kms. |
| {color:green}+1{color} | yarn tests |   3m  9s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:red}-1{color} | hdfs tests |  49m 56s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   3m 44s | Tests passed in 
hadoop-hdfs-httpfs. |
| | |  86m 33s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.namenode.TestSecureNameNode |
|   | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768851/HDFS-9309.001.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-kms test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/artifact/patchprocess/testrun_hadoop-kms.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-httpfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/artifact/patchprocess/testrun_hadoop-hdfs-httpfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13213/console |


This message was automatically generated.

> Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig()
> -
>
> Key: HDFS-9309
> URL: https://issues.apache.org/jira/browse/HDFS-9309
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HDFS-9309.001.patch
>
>
> When KeyStoreUtil.setupSSLConfig() is called, several files are created 
> (ssl-server.xml, ssl-client.xml, trustKS.jks, clientKS.jks, serverKS.jks). 
> However, if they are not deleted upon exit, weird thing can happen to any 
> subsequent tests.
> For example, if ssl-client.xml is not delete, but trustKS.jks is deleted, 
> TestWebHDFSOAuth2.listStatusReturnsAsExpected will fail with message:
> {noformat}
> java.io.IOException: Unable to load OAuth2 connection factory.
>   at java.io.FileInputStream.open(Native Method)
>   at java.io.FileInputStream.(FileInputStream.java:146)
>   at 
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:164)
>   at 
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.(ReloadingX509TrustManager.java:81)
>   at 
> org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:215)
>   at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131)
>   at 
> 

[jira] [Commented] (HDFS-9311) Support optional offload of NameNode HA service health checks to a separate RPC server.

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975788#comment-14975788
 ] 

Hadoop QA commented on HDFS-9311:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m  8s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |   9m 17s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 47s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 29s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 56s | The applied patch generated  1 
new checkstyle issues (total was 12, now 12). |
| {color:red}-1{color} | whitespace |   0m  6s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 53s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 43s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   9m 44s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  68m 49s | Tests failed in hadoop-hdfs. |
| | | 129m 55s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.ha.TestZKFailoverControllerStress |
|   | hadoop.ha.TestZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.util.TestByteArrayManager |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
| Timed out tests | 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
|   | 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol 
|
|   | 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles |
|   | 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768861/HDFS-9311.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13210/console |


This message was automatically generated.

> Support optional offload of NameNode HA service health checks to a separate 
> RPC server.
> ---
>
> Key: HDFS-9311
> URL: https://issues.apache.org/jira/browse/HDFS-9311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9311.001.patch
>
>
> When a NameNode is overwhelmed with load, it can lead to resource exhaustion 
> of the RPC handler pools (both client-facing and service-facing).  
> Eventually, this blocks the health check RPC issued from ZKFC, which triggers 
> a failover.  Depending on fencing configuration, the former active NameNode 
> may be killed.  In an overloaded situation, the new active NameNode is likely 
> to suffer the same fate, because client load patterns don't change after the 
> failover.  This can degenerate into flapping between the 2 NameNodes without 
> real recovery.  If a NameNode had been killed by fencing, then it would have 
> to transition through safe mode, further delaying time to recovery.
> This issue proposes a separate, optional RPC server 

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976266#comment-14976266
 ] 

Hadoop QA commented on HDFS-9276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  23m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  12m 50s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  18m 22s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 44s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 54s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 58s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 31s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  10m 55s | Tests failed in 
hadoop-common. |
| | |  75m 18s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
| Failed unit tests | hadoop.fs.TestLocalFsFCStatistics |
|   | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.ipc.TestRPCWaitForProxy |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768947/HDFS-9276.05.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 96677be |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13221/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13221/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13221/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13221/console |


This message was automatically generated.

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, debug1.PNG, 
> debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long-running applications. 
> The HDFS client generates private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens are not updated, which 
> causes the tokens to expire.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   

[jira] [Updated] (HDFS-9261) Erasure Coding: Skip encoding the data cells if all the parity data streamers are failed for the current block group

2015-10-27 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9261:
---
Attachment: HDFS-9261-02.patch

Attached a new patch fixing the checkstyle issue.

> Erasure Coding: Skip encoding the data cells if all the parity data streamers 
> are failed for the current block group
> 
>
> Key: HDFS-9261
> URL: https://issues.apache.org/jira/browse/HDFS-9261
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9261-00.patch, HDFS-9261-01.patch, 
> HDFS-9261-02.patch
>
>
> {{DFSStripedOutputStream}} will continue writing with minimum number 
> (dataBlockNum) of live datanodes. It won't replace the failed datanodes 
> immediately for the current block group. Consider a case where all the parity 
> data streamers are failed, now it is unnecessary to encode the data block 
> cells and generate the parity data. This is a corner case where it can skip 
> {{writeParityCells()}} step.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976222#comment-14976222
 ] 

Brahma Reddy Battula commented on HDFS-9317:


[~ajisakaa], thanks for reporting and uploading the patch. Patch LGTM, +1 
(non-binding).

> Document fsck -blockId and -storagepolicy options in branch-2.7
> ---
>
> Key: HDFS-9317
> URL: https://issues.apache.org/jira/browse/HDFS-9317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-9317-branch-2.7.00.patch
>
>
> The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
> HDFS-7467, but the options are not documented in the 2.7.1 release.
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9317) Document fsck -blockId and -storagepolicy options in branch-2.7

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976232#comment-14976232
 ] 

Hadoop QA commented on HDFS-9317:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768953/HDFS-9317-branch-2.7.00.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 96677be |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13222/console |


This message was automatically generated.

> Document fsck -blockId and -storagepolicy options in branch-2.7
> ---
>
> Key: HDFS-9317
> URL: https://issues.apache.org/jira/browse/HDFS-9317
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
> Attachments: HDFS-9317-branch-2.7.00.patch
>
>
> The fsck -blockId and -storagepolicy options are implemented by HDFS-6663 and 
> HDFS-7467, but the options are not documented in the 2.7.1 release.
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#fsck



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9275) Fix TestRecoverStripedFile

2015-10-27 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976233#comment-14976233
 ] 

Yi Liu commented on HDFS-9275:
--

Walter, sorry I forgot this JIRA :-) 

For a contiguous block, if n replicas are missing (with 3 replicas in total, at 
most 2 can be missing, so n < 3), we check the number of replicas already in 
PendingReplicationBlocks to see whether we need to schedule new block 
replication.
Ideally, reconstruction of a striped block should follow the same pattern: any 
missing striped internal block only needs to be reconstructed once, so we 
should check whether there is already one entry in PendingReplicationBlocks. 
Currently, however, we track the whole block group in that list, so we end up 
comparing the total number of missing striped internal blocks with the number 
in PendingReplicationBlocks. If more than two striped internal blocks are 
missing and one is reconstructed first, this can trigger some unnecessary 
reconstruction. I think we can make a simple improvement for striped blocks: 
if there is already one entry in PendingReplicationBlocks, don't schedule new 
reconstruction work, instead of comparing against the number of missing 
striped internal blocks.
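
A rough sketch of that check; the method and parameter names below are illustrative, not the actual BlockManager code:

{code:java}
// Illustrative fragment. One pending task is enough to regenerate a missing
// striped internal block, so skip scheduling while one is in flight;
// contiguous blocks keep the existing pending-count comparison.
boolean shouldScheduleReconstruction(BlockInfo block, int numMissing,
    PendingReplicationBlocks pending) {
  int numPending = pending.getNumReplicas(block);
  if (block.isStriped()) {
    return numPending < 1;
  }
  return numPending < numMissing;
}
{code}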

> Fix TestRecoverStripedFile
> --
>
> Key: HDFS-9275
> URL: https://issues.apache.org/jira/browse/HDFS-9275
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9275.01.patch, HDFS-9275.02.patch, 
> HDFS-9275.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9275) Fix TestRecoverStripedFile

2015-10-27 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976289#comment-14976289
 ] 

Kai Zheng commented on HDFS-9275:
-

Thanks Walter for filing the JIRA and working on this. Could you add a 
description of the exact issue or root cause this JIRA is addressing? That 
would help others understand the issue at a glance. Thanks.

> Fix TestRecoverStripedFile
> --
>
> Key: HDFS-9275
> URL: https://issues.apache.org/jira/browse/HDFS-9275
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9275.01.patch, HDFS-9275.02.patch, 
> HDFS-9275.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976578#comment-14976578
 ] 

Hadoop QA commented on HDFS-9259:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 13s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 43s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  49m 23s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 32s | Tests passed in 
hadoop-hdfs-client. |
| | | 101m 44s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768842/HDFS-9259.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bcb2386 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13224/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13224/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13224/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13224/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13224/console |


This message was automatically generated.

> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC HDFS writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for HDFS writes. The test ran "hadoop fs -copyFromLocal" 
> on a 256MB file across DCs with different SendBufferSize and ReceiveBufferSize 
> values. The results showed that c is much faster than b, and b is faster than a:
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128K, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both)
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] [~He 
> Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7725) Incorrect "nodes in service" metrics caused all writes to fail

2015-10-27 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976626#comment-14976626
 ] 

Kihwal Lee commented on HDFS-7725:
--

Cherry-picked it to branch-2.7.

> Incorrect "nodes in service" metrics caused all writes to fail
> --
>
> Key: HDFS-7725
> URL: https://issues.apache.org/jira/browse/HDFS-7725
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-7725-2.patch, HDFS-7725-3.patch, HDFS-7725.patch
>
>
> One of our clusters sometimes couldn't allocate blocks from any DNs. 
> BlockPlacementPolicyDefault complains with the following messages for all DNs.
> {noformat}
> the node is too busy (load:x > y)
> {noformat}
> It turns out the {{HeartbeatManager}}'s {{nodesInService}} was computed 
> incorrectly when admins decomm or recomm dead nodes. Here are two scenarios.
> * Decomm dead nodes. It turns out HDFS-7374 has fixed it; not sure if it is 
> intentional. cc / [~zhz], [~andrew.wang], [~atm] Here is the sequence of 
> event without HDFS-7374.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == -1
> * However, HDFS-7374 introduces another inconsistency when recomm is involved.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == 0
> ** Recomm the node. nodesInService == 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9259:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

+1. Committed to trunk and branch-2. Thanks [~liuml07] for the contribution. 
Thanks [~cmccabe] for the suggestion.

> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC HDFS writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for HDFS writes. The test ran "hadoop fs -copyFromLocal" 
> on a 256MB file across DCs with different SendBufferSize and ReceiveBufferSize 
> values. The results showed that c is much faster than b, and b is faster than a:
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128K, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both)
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] [~He 
> Tianyi] [~kanaka] [~vinayrpet].
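
For reference, a hedged sketch of opting into scenario (c) from the client side, assuming the key added by this change is {{dfs.client.socket.send.buffer.size}} and that a value of 0 means "leave the buffer unset so TCP auto-tuning applies" (please check hdfs-default.xml of your release for the exact key and semantics):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class AutoTunedWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed key and semantics: 0 skips the explicit setSendBufferSize()
    // call on the client socket, letting the kernel auto-tune (scenario c).
    conf.setInt("dfs.client.socket.send.buffer.size", 0);
    FileSystem fs = FileSystem.get(conf); // subsequent writes use auto-tuning
    fs.close();
  }
}
{code}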



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7725) Incorrect "nodes in service" metrics caused all writes to fail

2015-10-27 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7725:
-
Labels: 2.7.2-candidate  (was: )

> Incorrect "nodes in service" metrics caused all writes to fail
> --
>
> Key: HDFS-7725
> URL: https://issues.apache.org/jira/browse/HDFS-7725
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-7725-2.patch, HDFS-7725-3.patch, HDFS-7725.patch
>
>
> One of our clusters sometimes couldn't allocate blocks from any DNs. 
> BlockPlacementPolicyDefault complains with the following messages for all DNs.
> {noformat}
> the node is too busy (load:x > y)
> {noformat}
> It turns out the {{HeartbeatManager}}'s {{nodesInService}} was computed 
> incorrectly when admins decomm or recomm dead nodes. Here are two scenarios.
> * Decomm dead nodes. It turns out HDFS-7374 has fixed it; not sure if it is 
> intentional. cc / [~zhz], [~andrew.wang], [~atm] Here is the sequence of 
> event without HDFS-7374.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == -1
> * However, HDFS-7374 introduces another inconsistency when recomm is involved.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == 0
> ** Recomm the node. nodesInService == 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7725) Incorrect "nodes in service" metrics caused all writes to fail

2015-10-27 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976523#comment-14976523
 ] 

Kuhu Shukla commented on HDFS-7725:
---

This issue manifested on 2.6 (prior to HDFS-7374, with the count going to -1) 
and partially on 2.7 during recommissioning.
The unit test from this patch (testXceiverCount) fails on 2.7 at the 
recommission assert:
{code}
// Verify that recommission of a dead node won't impact nodesInService metrics.
dnm.stopDecommission(dnd);
assertEquals(expectedInServiceNodes, getNumDNInService(namesystem));
{code}
It would be nice to have this patch ported to 2.7. [~mingma], any 
suggestions/comments would be helpful.

> Incorrect "nodes in service" metrics caused all writes to fail
> --
>
> Key: HDFS-7725
> URL: https://issues.apache.org/jira/browse/HDFS-7725
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-7725-2.patch, HDFS-7725-3.patch, HDFS-7725.patch
>
>
> One of our clusters sometimes couldn't allocate blocks from any DNs. 
> BlockPlacementPolicyDefault complains with the following messages for all DNs.
> {noformat}
> the node is too busy (load:x > y)
> {noformat}
> It turns out the {{HeartbeatManager}}'s {{nodesInService}} was computed 
> incorrectly when admins decomm or recomm dead nodes. Here are two scenarios.
> * Decomm dead nodes. It turns out HDFS-7374 has fixed it; not sure if it is 
> intentional. cc / [~zhz], [~andrew.wang], [~atm] Here is the sequence of 
> event without HDFS-7374.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == -1
> * However, HDFS-7374 introduces another inconsistency when recomm is involved.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == 0
> ** Recomm the node. nodesInService == 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9303) Balancer slowly with too many small file blocks

2015-10-27 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9303:

Attachment: HDFS-9303.002.patch

Pulled the recent code and added a new parameter, blockBytesNum. There is a 
small difference from HDFS-8824: with this balancer parameter, the balancer 
can be started without changing any config value, which is more convenient. 
This strengthens the Balancer.

> Balancer slowly with too many small file blocks
> ---
>
> Key: HDFS-9303
> URL: https://issues.apache.org/jira/browse/HDFS-9303
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9303.001.patch, HDFS-9303.002.patch
>
>
> In recent Hadoop releases I have found that the balance operation is always 
> slow, even after upgrading the version. When I analysed the balancer log, I 
> found that each balance iteration takes only 4 to 5 minutes, which is a short 
> time. Most importantly, most of the blocks being moved are small blocks of 
> less than 1 MB, and this is a main reason for the low effectiveness of the 
> balance operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-27 Thread HeeSoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeeSoo Kim updated HDFS-7984:
-
Attachment: HDFS-7984.003.patch

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its 
> own. This would allow cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9303) Balancer slowly with too many small file blocks

2015-10-27 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976548#comment-14976548
 ] 

Lin Yiqun commented on HDFS-9303:
-

There is a small difference from HDFS-8824; please see my latest patch.

> Balancer slowly with too many small file blocks
> ---
>
> Key: HDFS-9303
> URL: https://issues.apache.org/jira/browse/HDFS-9303
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9303.001.patch, HDFS-9303.002.patch
>
>
> In recent Hadoop releases I have found that the balance operation is always 
> slow, even after upgrading the version. When I analysed the balancer log, I 
> found that each balance iteration takes only 4 to 5 minutes, which is a short 
> time. Most importantly, most of the blocks being moved are small blocks of 
> less than 1 MB, and this is a main reason for the low effectiveness of the 
> balance operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list

2015-10-27 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976605#comment-14976605
 ] 

Kihwal Lee commented on HDFS-8950:
--

This has been a source of a lot of confusion lately. Should be fixed in 2.7.

> NameNode refresh doesn't remove DataNodes that are no longer in the allowed 
> list
> 
>
> Key: HDFS-8950
> URL: https://issues.apache.org/jira/browse/HDFS-8950
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, 
> HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch
>
>
> If you remove a DN from the NN's allowed host list (HDFS was HA) and then 
> refresh the NN, it doesn't actually remove the DN, and the NN UI keeps showing 
> that node. The NN may also try to allocate blocks to that DN during an MR job. 
> This issue is independent of DN decommissioning.
> To reproduce:
> 1. Add a DN to dfs_hosts_allow
> 2. Refresh NN
> 3. Start DN. Now NN starts seeing DN.
> 4. Stop DN
> 5. Remove DN from dfs_hosts_allow
> 6. Refresh NN -> NN is still reporting DN as being used by HDFS.
> This is different from decommissioning because there the DN is added to the 
> exclude list in addition to being removed from the allowed list, and in that 
> case everything works correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list

2015-10-27 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976612#comment-14976612
 ] 

Kihwal Lee commented on HDFS-8950:
--

I tried to cherry-pick it, but {{TestDatanodeManager}} needs the following 
additional code.
{code:java}
  private static DatanodeManager mockDatanodeManager(
      FSNamesystem fsn, Configuration conf) throws IOException {
    // Mock the BlockManager so a DatanodeManager can be constructed directly.
    BlockManager bm = Mockito.mock(BlockManager.class);
    DatanodeManager dm = new DatanodeManager(bm, fsn, conf);
    return dm;
  }
{code}

After this, {{TestDecommission}}, {{TestDatanodeManager}} and 
{{TestHostFileManager}} all pass.

> NameNode refresh doesn't remove DataNodes that are no longer in the allowed 
> list
> 
>
> Key: HDFS-8950
> URL: https://issues.apache.org/jira/browse/HDFS-8950
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, 
> HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch
>
>
> If you remove a DN from the NN's allowed host list (HDFS was HA) and then 
> refresh the NN, it doesn't actually remove the DN, and the NN UI keeps showing 
> that node. The NN may also try to allocate blocks to that DN during an MR job. 
> This issue is independent of DN decommissioning.
> To reproduce:
> 1. Add a DN to dfs_hosts_allow
> 2. Refresh NN
> 3. Start DN. Now NN starts seeing DN.
> 4. Stop DN
> 5. Remove DN from dfs_hosts_allow
> 6. Refresh NN -> NN is still reporting DN as being used by HDFS.
> This is different from decommissioning because there the DN is added to the 
> exclude list in addition to being removed from the allowed list, and in that 
> case everything works correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list

2015-10-27 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8950:
-
Labels: 2.7.2-candidate  (was: )

> NameNode refresh doesn't remove DataNodes that are no longer in the allowed 
> list
> 
>
> Key: HDFS-8950
> URL: https://issues.apache.org/jira/browse/HDFS-8950
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, 
> HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch
>
>
> If you remove a DN from the NN's allowed host list (HDFS was HA) and then 
> refresh the NN, it doesn't actually remove the DN, and the NN UI keeps showing 
> that node. The NN may also try to allocate blocks to that DN during an MR job. 
> This issue is independent of DN decommissioning.
> To reproduce:
> 1. Add a DN to dfs_hosts_allow
> 2. Refresh NN
> 3. Start DN. Now NN starts seeing DN.
> 4. Stop DN
> 5. Remove DN from dfs_hosts_allow
> 6. Refresh NN -> NN is still reporting DN as being used by HDFS.
> This is different from decommissioning because there the DN is added to the 
> exclude list in addition to being removed from the allowed list, and in that 
> case everything works correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list

2015-10-27 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8950:
-
Attachment: HDFS-8950.branch-2.7.patch

Attaching the patch for branch-2.7.  Essentially the same patch plus the above 
static method in the test case.

> NameNode refresh doesn't remove DataNodes that are no longer in the allowed 
> list
> 
>
> Key: HDFS-8950
> URL: https://issues.apache.org/jira/browse/HDFS-8950
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0
>
> Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, 
> HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch, 
> HDFS-8950.branch-2.7.patch
>
>
> If you remove a DN from the NN's allowed host list (HDFS was HA) and then 
> refresh the NN, it doesn't actually remove the DN, and the NN UI keeps showing 
> that node. The NN may also try to allocate blocks to that DN during an MR job. 
> This issue is independent of DN decommissioning.
> To reproduce:
> 1. Add a DN to dfs_hosts_allow
> 2. Refresh NN
> 3. Start DN. Now NN starts seeing DN.
> 4. Stop DN
> 5. Remove DN from dfs_hosts_allow
> 6. Refresh NN -> NN is still reporting DN as being used by HDFS.
> This is different from decommissioning because there the DN is added to the 
> exclude list in addition to being removed from the allowed list, and in that 
> case everything works correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9302) WebHDFS throws NullPointerException if newLength is not provided

2015-10-27 Thread Jagadesh Kiran N (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated HDFS-9302:
---
Attachment: HDFS-9302_00.patch

Attached the patch, [~hitliuyi]; please review.

> WebHDFS throws NullPointerException if newLength is not provided
> 
>
> Key: HDFS-9302
> URL: https://issues.apache.org/jira/browse/HDFS-9302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
> Environment: Centos6
>Reporter: Karthik Palaniappan
>Assignee: Jagadesh Kiran N
>Priority: Minor
> Attachments: HDFS-9302_00.patch
>
>
> $ curl -X POST "http://namenode:50070/webhdfs/v1/foo?op=truncate"
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> We should change newLength to be a required parameter in the webhdfs 
> documentation 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#New_Length),
>  and throw an IllegalArgumentException if it isn't provided.
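
A minimal sketch of the proposed guard (a hypothetical helper, for illustration only; the actual fix belongs in the WebHDFS parameter handling):

{code:java}
// Hypothetical validation helper: fail fast with a clear message instead of
// letting a missing newLength surface later as a NullPointerException.
private static long checkNewLength(Long newLength) {
  if (newLength == null) {
    throw new IllegalArgumentException(
        "newLength is a required parameter for op=TRUNCATE");
  }
  return newLength;
}
{code}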



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9302) WebHDFS throws NullPointerException if newLength is not provided

2015-10-27 Thread Jagadesh Kiran N (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated HDFS-9302:
---
Status: Patch Available  (was: Open)

> WebHDFS throws NullPointerException if newLength is not provided
> 
>
> Key: HDFS-9302
> URL: https://issues.apache.org/jira/browse/HDFS-9302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
> Environment: Centos6
>Reporter: Karthik Palaniappan
>Assignee: Jagadesh Kiran N
>Priority: Minor
> Attachments: HDFS-9302_00.patch
>
>
> $ curl -X POST "http://namenode:50070/webhdfs/v1/foo?op=truncate"
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> We should change newLength to be a required parameter in the webhdfs 
> documentation 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#New_Length),
>  and throw an IllegalArgumentException if it isn't provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7725) Incorrect "nodes in service" metrics caused all writes to fail

2015-10-27 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7725:
-
Fix Version/s: (was: 2.8.0)
   2.7.2
   3.0.0

> Incorrect "nodes in service" metrics caused all writes to fail
> --
>
> Key: HDFS-7725
> URL: https://issues.apache.org/jira/browse/HDFS-7725
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: 2.7.2-candidate
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-7725-2.patch, HDFS-7725-3.patch, HDFS-7725.patch
>
>
> One of our clusters sometimes couldn't allocate blocks from any DNs. 
> BlockPlacementPolicyDefault complains with the following messages for all DNs.
> {noformat}
> the node is too busy (load:x > y)
> {noformat}
> It turns out the {{HeartbeatManager}}'s {{nodesInService}} was computed 
> incorrectly when admins decomm or recomm dead nodes. Here are two scenarios.
> * Decomm dead nodes. It turns out HDFS-7374 has fixed it; not sure if it is 
> intentional. cc / [~zhz], [~andrew.wang], [~atm] Here is the sequence of 
> event without HDFS-7374.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == -1
> * However, HDFS-7374 introduces another inconsistency when recomm is involved.
> ** Cluster has one live node. nodesInService == 1
> ** The node becomes dead. nodesInService == 0
> ** Decomm the node. nodesInService == 0
> ** Recomm the node. nodesInService == 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9279) Decomissioned capacity should not be considered for configured/used capacity

2015-10-27 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9279:
-
Target Version/s: 2.8.0

> Decomissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch, 
> HDFS-9279-v3.patch
>
>
> The capacity of a decommissioned node is being accounted in the configured 
> and used capacity metrics. This gives an incorrect perception of cluster 
> usage.
> Once a node is decommissioned, its capacity should be considered similar to 
> that of a dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976685#comment-14976685
 ] 

Hadoop QA commented on HDFS-9313:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m  6s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m 11s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m  3s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 27s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 40s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 48s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 55s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 42s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  55m  9s | Tests failed in hadoop-hdfs. |
| | | 108m 46s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768993/HDFS-9313-2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bcb2386 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13225/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13225/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13225/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13225/console |


This message was automatically generated.

> Possible NullPointerException in BlockManager if no excess replica can be 
> chosen
> 
>
> Key: HDFS-9313
> URL: https://issues.apache.org/jira/browse/HDFS-9313
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9313-2.patch, HDFS-9313.patch
>
>
> HDFS-8647 makes it easier to reason about various block placement scenarios. 
> Here is one possible case where BlockManager won't be able to find the excess 
> replica to delete: when the storage policy changes around the same time the 
> balancer moves the block. When this happens, it causes a NullPointerException.
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:156)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseReplicasToDelete(BlockPlacementPolicyDefault.java:978)
> {noformat}
> Note that this wasn't found in any production cluster; instead, it was found 
> by new unit tests. In addition, the issue existed before HDFS-8647.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-6327) Clean up FSDirectory

2015-10-27 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-6327.
--
Resolution: Fixed

Closing this jira as all the subtasks have been completed.

> Clean up FSDirectory
> 
>
> Key: HDFS-6327
> URL: https://issues.apache.org/jira/browse/HDFS-6327
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This is an umbrella jira that covers the clean-up work on the FSDirectory 
> class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976815#comment-14976815
 ] 

Chang Li commented on HDFS-9289:


[~zhz], I don't have a log showing that the file was completed with an old GS. 
But by looking up the block from the jsp page right now, I can see that the 
block blk_3773617405 currently has replicas on hosts ***657n26.***.com, 
***656n04.***.com, and ***656n38.***.com.
By going to those datanodes, I can see that the replicas there have the old 
genstamp.
{code}
bash-4.1$ hostname
***657n26.***.com
bash-4.1$ ls -l 
/grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 23 18:00 
/grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 23 18:00 
/grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta

bash-4.1$ hostname
***656n04.***.com
bash-4.1$ ls -l 
/grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 21 19:14 
/grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 21 19:14 
/grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta

bash-4.1$ hostname
***656n38.***.com
bash-4.1$ ls -l 
/grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 23 09:14 
/grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 23 09:14 
/grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta
{code}

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a case of a corrupt block caused by the file completing after a 
> pipelineUpdate, but with the old block genStamp. This caused the replicas on 
> two datanodes in the updated pipeline to be viewed as corrupt. Propose to 
> check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen

2015-10-27 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976853#comment-14976853
 ] 

Ming Ma commented on HDFS-9313:
---

Thanks [~zhz]. If it continued, given that no state has changed and no 
alternative approach is taken, it would just keep looping.
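A hedged, self-contained sketch of the guard under discussion (simplified 
types; not the actual {{BlockPlacementPolicyDefault}} signatures):
{code}
import java.util.ArrayList;
import java.util.List;

class ExcessReplicaChooser {
  /** Stand-in for chooseReplicaToDelete(): null when no candidate fits the policy. */
  static String chooseReplicaToDelete(List<String> candidates) {
    return candidates.isEmpty() ? null : candidates.get(0);
  }

  static List<String> chooseReplicasToDelete(List<String> replicas, int replication) {
    List<String> excess = new ArrayList<String>();
    while (replicas.size() > replication) {
      String cur = chooseReplicaToDelete(replicas);
      if (cur == null) {
        // Nothing changes between iterations once no candidate exists, so
        // continuing would loop forever; break instead of hitting an NPE.
        break;
      }
      replicas.remove(cur);
      excess.add(cur);
    }
    return excess;
  }
}
{code}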

> Possible NullPointerException in BlockManager if no excess replica can be 
> chosen
> 
>
> Key: HDFS-9313
> URL: https://issues.apache.org/jira/browse/HDFS-9313
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9313-2.patch, HDFS-9313.patch
>
>
> HDFS-8647 makes it easier to reason about various block placement scenarios. 
> Here is one possible case where BlockManager won't be able to find the excess 
> replica to delete: when the storage policy changes around the same time the 
> balancer moves the block. When this happens, it causes a NullPointerException.
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:156)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseReplicasToDelete(BlockPlacementPolicyDefault.java:978)
> {noformat}
> Note that this wasn't found in any production cluster; instead, it was found 
> by new unit tests. In addition, the issue existed before HDFS-8647.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-27 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9299:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

committed to 2.8.  thanks, [~sfriberg]

> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9299.001.patch
>
>
> Currently the thread name shown in the ReplicationMonitor's log output is the 
> full class name; by setting a name on the thread, the output becomes easier 
> to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
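A hedged sketch of the change (simplified; not the actual BlockManager code):
{code}
// Naming the thread replaces the default "Class$Inner@2fbdc5dd" identifier
// in log prefixes with a readable "ReplicationMonitor".
Thread monitorThread = new Thread(new Runnable() {
  @Override
  public void run() {
    // replication monitor work loop would go here
  }
}, "ReplicationMonitor");
monitorThread.setDaemon(true);
monitorThread.start();
{code}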



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9302) WebHDFS throws NullPointerException if newLength is not provided

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976756#comment-14976756
 ] 

Hadoop QA commented on HDFS-9302:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 56s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 22s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 24s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 33s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  50m  0s | Tests failed in hadoop-hdfs. |
| | |  95m 49s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12769001/HDFS-9302_00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c28e16b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13226/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13226/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13226/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13226/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13226/console |


This message was automatically generated.

> WebHDFS throws NullPointerException if newLength is not provided
> 
>
> Key: HDFS-9302
> URL: https://issues.apache.org/jira/browse/HDFS-9302
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
> Environment: Centos6
>Reporter: Karthik Palaniappan
>Assignee: Jagadesh Kiran N
>Priority: Minor
> Attachments: HDFS-9302_00.patch
>
>
> $ curl -X POST "http://namenode:50070/webhdfs/v1/foo?op=truncate"
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}
> We should change newLength to be a required parameter in the webhdfs 
> documentation 
> (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#New_Length),
> and throw an IllegalArgumentException if it isn't provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-27 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.009.patch

Fix for the timed-out test 
org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks.

We need to remove via the iterator, and not from the tree, during iteration to 
avoid a ConcurrentModificationException.
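A self-contained illustration of the fix described above:
{code}
import java.util.Iterator;
import java.util.TreeSet;

class IteratorRemovalDemo {
  // Removing through the iterator is safe; removing from the underlying
  // tree mid-iteration throws ConcurrentModificationException.
  static void pruneEvenIds(TreeSet<Long> blockIds) {
    for (Iterator<Long> it = blockIds.iterator(); it.hasNext(); ) {
      long id = it.next();
      if (id % 2 == 0) {
        it.remove();            // OK: mutate via the iterator
        // blockIds.remove(id); // would throw ConcurrentModificationException
      }
    }
  }
}
{code}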

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change, and also get some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9303) Balancer slowly with too many small file blocks

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976915#comment-14976915
 ] 

Hadoop QA commented on HDFS-9303:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 11s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  11m 13s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  14m 53s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 33s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 11s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   2m  9s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 46s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 23s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  58m  4s | Tests failed in hadoop-hdfs. |
| | | 115m 46s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.fs.contract.hdfs.TestHDFSContractSetTimes |
|   | hadoop.hdfs.TestWriteReadStripedFile |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSeek |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.fs.TestWebHdfsFileContextMainOperations |
|   | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
| Timed out tests | 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | org.apache.hadoop.fs.viewfs.TestViewFsFileStatusHdfs |
|   | org.apache.hadoop.hdfs.TestDFSAddressConfig |
|   | org.apache.hadoop.fs.contract.hdfs.TestHDFSContractRename |
|   | org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum |
|   | org.apache.hadoop.fs.shell.TestHdfsTextCommand |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12769005/HDFS-9303.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / aa09880 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13228/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13228/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13228/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13228/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13228/console |


This message was automatically generated.

> Balancer slowly with too many small file blocks
> ---
>
> Key: HDFS-9303
> URL: https://issues.apache.org/jira/browse/HDFS-9303
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9303.001.patch, HDFS-9303.002.patch
>
>
> In recent Hadoop release versions I have found that the balance operation is 
> always slow, even after upgrading. When I analysed the balancer log, I found 
> that each balance iteration takes only 4 to 5 minutes, which is a short time. 
> Most importantly, most of the blocks being moved are small blocks, smaller 
> than 1MB. This is a main reason for the low effectiveness of the balance 
> operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9168) Move client side unit test to hadoop-hdfs-client

2015-10-27 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976841#comment-14976841
 ] 

Jing Zhao commented on HDFS-9168:
-

The patch looks good to me. One nit is that we can use this chance to fix 
condition (2) in the following javadoc:
{code}
 *   Add a new datanode only if r >= 3 and either
 *   (1) floor(r/2) >= n; or
 *   (2) r > n and the block is hflushed/appended.
 */
{code}

Also, please explain in general terms why the changes in 
{{ReplaceDatanodeOnFailure}} are necessary. +1 after addressing the comments.

> Move client side unit test to hadoop-hdfs-client
> 
>
> Key: HDFS-9168
> URL: https://issues.apache.org/jira/browse/HDFS-9168
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9168.000.patch, HDFS-9168.001.patch, 
> HDFS-9168.002.patch, HDFS-9168.003.patch
>
>
> We need to identify the client-side unit tests of hdfs and move them to the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9168) Move client side unit test to hadoop-hdfs-client

2015-10-27 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9168:
-
Attachment: HDFS-9168.004.patch

> Move client side unit test to hadoop-hdfs-client
> 
>
> Key: HDFS-9168
> URL: https://issues.apache.org/jira/browse/HDFS-9168
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9168.000.patch, HDFS-9168.001.patch, 
> HDFS-9168.002.patch, HDFS-9168.003.patch, HDFS-9168.004.patch
>
>
> We need to identify the client-side unit tests of hdfs and move them to the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9168) Move client side unit test to hadoop-hdfs-client

2015-10-27 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976858#comment-14976858
 ] 

Haohui Mai commented on HDFS-9168:
--

Thanks Jing for the reviews. Uploaded the v4 patch to address the comment.

The changes in {{ReplaceDatanodeOnFailure}} are to avoid the findbugs warnings 
in the v2 patch. {{java.lang.Enum}} implements {{Serializable}}. The old 
implementation put the fields in the enum, which generated a findbugs warning.
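A self-contained illustration of the problem described above (the enums below 
are hypothetical, not the actual {{ReplaceDatanodeOnFailure}} code):
{code}
// The old shape: a mutable, non-transient field on an enum constant.
// java.lang.Enum implements Serializable, so FindBugs flags such fields.
enum BadPolicy {
  DEFAULT;
  boolean bestEffort;  // <-- draws a FindBugs serialization/mutability warning
}

// The refactored shape: the enum stays stateless; the extra state moves
// into an immutable holder class.
enum Policy {
  DISABLE, DEFAULT, ALWAYS;
}

final class PolicyWithFlags {
  private final Policy policy;
  private final boolean bestEffort;

  PolicyWithFlags(Policy policy, boolean bestEffort) {
    this.policy = policy;
    this.bestEffort = bestEffort;
  }

  Policy getPolicy() { return policy; }
  boolean isBestEffort() { return bestEffort; }
}
{code}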

> Move client side unit test to hadoop-hdfs-client
> 
>
> Key: HDFS-9168
> URL: https://issues.apache.org/jira/browse/HDFS-9168
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9168.000.patch, HDFS-9168.001.patch, 
> HDFS-9168.002.patch, HDFS-9168.003.patch, HDFS-9168.004.patch
>
>
> We need to identify the client-side unit tests of hdfs and move them to the 
> hdfs-client module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9311) Support optional offload of NameNode HA service health checks to a separate RPC server.

2015-10-27 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976877#comment-14976877
 ] 

Jitendra Nath Pandey commented on HDFS-9311:


[~cnauroth], thanks for creating a separate jira and adding this improvement 
for the health check first.
The patch looks very good. One comment:
A health check needs very few threads. The ratio of 0.1 could mean more than 
10 threads on many large clusters, which may be a waste.

> Support optional offload of NameNode HA service health checks to a separate 
> RPC server.
> ---
>
> Key: HDFS-9311
> URL: https://issues.apache.org/jira/browse/HDFS-9311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9311.001.patch
>
>
> When a NameNode is overwhelmed with load, it can lead to resource exhaustion 
> of the RPC handler pools (both client-facing and service-facing).  
> Eventually, this blocks the health check RPC issued from ZKFC, which triggers 
> a failover.  Depending on fencing configuration, the former active NameNode 
> may be killed.  In an overloaded situation, the new active NameNode is likely 
> to suffer the same fate, because client load patterns don't change after the 
> failover.  This can degenerate into flapping between the 2 NameNodes without 
> real recovery.  If a NameNode had been killed by fencing, then it would have 
> to transition through safe mode, further delaying time to recovery.
> This issue proposes a separate, optional RPC server at the NameNode for 
> isolating the HA health checks.  These health checks are lightweight 
> operations that do not suffer from contention issues on the namesystem lock 
> or other shared resources.  Isolating the RPC handlers is sufficient to avoid 
> this situation.
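An illustrative-only sketch of the isolation idea (all names here are 
assumptions, not the patch's API):
{code}
// A dedicated server object with its own small, fixed handler pool answers
// only HA health probes, so congestion on the client-facing RPC handlers
// cannot block the ZKFC check and trigger a spurious failover.
class HealthCheckRpcServer {
  private final int handlerCount;

  HealthCheckRpcServer(int handlerCount) {
    // A handful of threads suffices: health checks do not contend on the
    // namesystem lock or other shared resources.
    this.handlerCount = Math.max(1, handlerCount);
  }

  boolean monitorHealth() {
    return true;  // lightweight liveness probe
  }
}
{code}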



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976942#comment-14976942
 ] 

Hudson commented on HDFS-9259:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #543 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/543/])
HDFS-9259. Make SO_SNDBUF size configurable at DFSClient side for hdfs (mingma: 
rev aa09880ab85f3c35c12373976e7b03f3140b65c8)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientSocketSize.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" on a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] 
> [~He Tianyi] [~kanaka] [~vinayrpet].
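A hedged Java sketch of the intended client-side behavior (illustrative 
helper, not the actual DataStreamer code):
{code}
import java.net.Socket;
import java.net.SocketException;

class SendBufferConfig {
  /**
   * Apply the configured send buffer size, where a non-positive value means
   * "leave SO_SNDBUF unset" so the kernel's TCP auto-tuning (scenario c
   * above) governs the window.
   */
  static void applySendBuffer(Socket socket, int configuredBytes)
      throws SocketException {
    if (configuredBytes > 0) {
      socket.setSendBufferSize(configuredBytes);
    }
    // else: do not call setSendBufferSize() at all
  }
}
{code}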



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976735#comment-14976735
 ] 

Zhe Zhang commented on HDFS-9289:
-

bq. the client after updatepipeline with the new gen stamp it later completed 
file with the old gen stamp
This looks very strange. But why do you think this happened? Did you see logs 
showing that the file was completed with an old GS?

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a case of a corrupt block caused by the file completing after a 
> pipelineUpdate, but with the old block genStamp. This caused the replicas on 
> two datanodes in the updated pipeline to be viewed as corrupt. Propose to 
> check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976794#comment-14976794
 ] 

Zhe Zhang commented on HDFS-9313:
-

Thanks Ming for the fix. I haven't dived into full details of the two 
{{chooseReplicaToDelete}} methods, but a quick question is why we should 
{{break}} instead of {{continue}} when seeing a null value?

> Possible NullPointerException in BlockManager if no excess replica can be 
> chosen
> 
>
> Key: HDFS-9313
> URL: https://issues.apache.org/jira/browse/HDFS-9313
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9313-2.patch, HDFS-9313.patch
>
>
> HDFS-8647 makes it easier to reason about various block placement scenarios. 
> Here is one possible case where BlockManager won't be able to find the excess 
> replica to delete: when the storage policy changes around the same time the 
> balancer moves the block. When this happens, it causes a NullPointerException.
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:156)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseReplicasToDelete(BlockPlacementPolicyDefault.java:978)
> {noformat}
> Note that this wasn't found in any production cluster; instead, it was found 
> by new unit tests. In addition, the issue existed before HDFS-8647.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976871#comment-14976871
 ] 

Hudson commented on HDFS-9259:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8715 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8715/])
HDFS-9259. Make SO_SNDBUF size configurable at DFSClient side for hdfs (mingma: 
rev aa09880ab85f3c35c12373976e7b03f3140b65c8)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientSocketSize.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" on a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] 
> [~He Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9279) Decomissioned capacity should not be considered for configured/used capacity

2015-10-27 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9279:
--
Attachment: HDFS-9279-v4.patch

Updated patch that keeps the XceiverCount update as-is. Changed the unit test 
accordingly. Only the TestNamenodeCapacityReport test failure was related to my 
patch; the rest of the test failures are not seen locally.

> Decomissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch, 
> HDFS-9279-v3.patch, HDFS-9279-v4.patch
>
>
> The capacity of a decommissioned node is being accounted in the configured 
> and used capacity metrics. This gives an incorrect perception of cluster 
> usage.
> Once a node is decommissioned, its capacity should be considered similar to 
> that of a dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976925#comment-14976925
 ] 

Zhe Zhang commented on HDFS-9289:
-

The fact that all 3 DNs have the old GS doesn't mean the client also has an 
old GS. Is the above log from the same cluster as the previous [logs | 
https://issues.apache.org/jira/browse/HDFS-9289?focusedCommentId=14972655=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14972655]?

In these cases, is there any replica with the correct (new) GS? If so, it 
doesn't look like a bug. If all replicas of a block have the old GS, then it's 
more suspicious.

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a case of a corrupt block caused by the file completing after a 
> pipelineUpdate, but with the old block genStamp. This caused the replicas on 
> two datanodes in the updated pipeline to be viewed as corrupt. Propose to 
> check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9307) fuseConnect should be private to fuse_connect.c

2015-10-27 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9307:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> fuseConnect should be private to fuse_connect.c
> ---
>
> Key: HDFS-9307
> URL: https://issues.apache.org/jira/browse/HDFS-9307
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Colin Patrick McCabe
>Assignee: Mingliang Liu
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9307.000.patch, HDFS-9307.001.patch
>
>
> fuseConnect should be private to fuse_connect.c, since it's not used outside 
> that file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9307) fuseConnect should be private to fuse_connect.c

2015-10-27 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976939#comment-14976939
 ] 

Colin Patrick McCabe commented on HDFS-9307:


+1.  Thanks, [~liuml07].

> fuseConnect should be private to fuse_connect.c
> ---
>
> Key: HDFS-9307
> URL: https://issues.apache.org/jira/browse/HDFS-9307
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Colin Patrick McCabe
>Assignee: Mingliang Liu
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9307.000.patch, HDFS-9307.001.patch
>
>
> fuseConnect should be private to fuse_connect.c, since it's not used outside 
> that file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977044#comment-14977044
 ] 

Hudson commented on HDFS-9259:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #604 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/604/])
HDFS-9259. Make SO_SNDBUF size configurable at DFSClient side for hdfs (mingma: 
rev aa09880ab85f3c35c12373976e7b03f3140b65c8)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientSocketSize.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" on a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] 
> [~He Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9307) fuseConnect should be private to fuse_connect.c

2015-10-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977072#comment-14977072
 ] 

Mingliang Liu commented on HDFS-9307:
-

Thank you [~cmccabe] for your reporting this jira, reviewing and committing the 
final patch.

> fuseConnect should be private to fuse_connect.c
> ---
>
> Key: HDFS-9307
> URL: https://issues.apache.org/jira/browse/HDFS-9307
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Colin Patrick McCabe
>Assignee: Mingliang Liu
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9307.000.patch, HDFS-9307.001.patch
>
>
> fuseConnect should be private to fuse_connect.c, since it's not used outside 
> that file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977156#comment-14977156
 ] 

Chang Li commented on HDFS-9289:


[~zhz], yes, the above log is from the same cluster as the first log I posted.

The two replicas on the two datanodes from the updated pipeline had the new GS, 
but they were marked as corrupt because the block was committed with the old 
genstamp.
The complete story of what happened in that cluster is: there were initially 3 
datanodes in the pipeline, d1, d2, d3. Then a pipeline update happened with 
only d2 and d3, with a new GS. Then the file completed with the old GS, and d2 
and d3 were marked corrupt. After 1 day, the full block report from d1 came in, 
and the NN found that d1 had the right block with the "correct" old GS but was 
under-replicated, so the NN told d1 to replicate its replica with the old GS to 
two other nodes, d4 and d5. So the 3 DNs I showed above were d1, d4, and d5, 
all having the old GS.
I think there probably exists a cache coherence issue, since
{code}protected ExtendedBlock block;{code}
lacks volatile. That could also explain why this issue didn't happen frequently.
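A minimal illustration of the visibility hazard described above (hypothetical 
field and method names, not the actual DataStreamer code):
{code}
// Without volatile, the thread that completes the file may read a stale
// generation stamp even though another thread already ran updatePipeline().
class StreamerState {
  private volatile long blockGenStamp;  // volatile publishes the new GS

  void onPipelineUpdate(long newGenStamp) {
    blockGenStamp = newGenStamp;  // written by the pipeline-recovery thread
  }

  long genStampForComplete() {
    return blockGenStamp;  // read by the thread issuing completeFile()
  }
}
{code}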

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch
>
>
> We have seen a case of a corrupt block caused by the file completing after a 
> pipelineUpdate, but with the old block genStamp. This caused the replicas on 
> two datanodes in the updated pipeline to be viewed as corrupt. Propose to 
> check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977012#comment-14977012
 ] 

Hudson commented on HDFS-9259:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #1327 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1327/])
HDFS-9259. Make SO_SNDBUF size configurable at DFSClient side for hdfs (mingma: 
rev aa09880ab85f3c35c12373976e7b03f3140b65c8)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientSocketSize.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" on a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] 
> [~He Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977092#comment-14977092
 ] 

Hudson commented on HDFS-9299:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #8716 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8716/])
HDFS-9299. Give ReplicationMonitor a readable thread name (Staffan (cmccabe: 
rev fe93577faf49ceb2ee47a7762a61625313ea773b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9299.001.patch
>
>
> Currently the thread name shown in the ReplicationMonitor's log output is the 
> full class name; by setting a name on the thread, the output becomes easier 
> to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9307) fuseConnect should be private to fuse_connect.c

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977087#comment-14977087
 ] 

Hudson commented on HDFS-9307:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #8716 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8716/])
HDFS-9307. fuseConnect should be private to fuse_connect.c (Mingliang (cmccabe: 
rev faeb6a3f89f3580a5b1a40c6a1f6205269a5aa7a)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.h
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c


> fuseConnect should be private to fuse_connect.c
> ---
>
> Key: HDFS-9307
> URL: https://issues.apache.org/jira/browse/HDFS-9307
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Colin Patrick McCabe
>Assignee: Mingliang Liu
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9307.000.patch, HDFS-9307.001.patch
>
>
> fuseConnect should be private to fuse_connect.c, since it's not used outside 
> that file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7163) WebHdfsFileSystem should retry reads according to the configured retry policy.

2015-10-27 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated HDFS-7163:
-
Description: 
In the current implementation of WebHdfsFileSystem, opens are retried according 
to the configured retry policy, but not reads. Therefore, if a connection goes 
down while data is being read, the read will fail and the read will have to be 
retried by the client code.

Also, after the connection has been re-established, the next read (or 
seek/read) will fail and the read will have to be restarted by the client code.
Summary: WebHdfsFileSystem should retry reads according to the 
configured retry policy.  (was: WebHdfsFileSystem should retry reads in a 
similar way as the open)
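A hedged sketch of the proposed behavior (illustrative names and a simplified 
retry policy; not the actual WebHdfsFileSystem internals):
{code}
import java.io.IOException;
import java.io.InputStream;

abstract class RetryingWebHdfsStream {
  private InputStream in;
  private long pos;
  private final int maxAttempts = 3;  // stands in for the configured policy

  RetryingWebHdfsStream(InputStream initial) {
    this.in = initial;
  }

  /** Re-open the HTTP connection and seek back to the given offset. */
  abstract InputStream reopenAtOffset(long offset) throws IOException;

  int read(byte[] buf, int off, int len) throws IOException {
    for (int attempt = 1; ; attempt++) {
      try {
        int n = in.read(buf, off, len);
        if (n > 0) {
          pos += n;  // track position so a retry resumes where we left off
        }
        return n;
      } catch (IOException e) {
        if (attempt >= maxAttempts) {
          throw e;  // retry budget exhausted; surface the failure
        }
        in = reopenAtOffset(pos);  // re-establish the connection at pos
      }
    }
  }
}
{code}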

> WebHdfsFileSystem should retry reads according to the configured retry policy.
> --
>
> Key: HDFS-7163
> URL: https://issues.apache.org/jira/browse/HDFS-7163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0, 2.5.1
>Reporter: Eric Payne
>Assignee: Eric Payne
>
> In the current implementation of WebHdfsFileSystem, opens are retried 
> according to the configured retry policy, but not reads. Therefore, if a 
> connection goes down while data is being read, the read will fail and the 
> read will have to be retried by the client code.
> Also, after the connection has been re-established, the next read (or 
> seek/read) will fail and the read will have to be restarted by the client code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7163) WebHdfsFileSystem should retry reads according to the configured retry policy.

2015-10-27 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated HDFS-7163:
-
Attachment: HDFS-7163.001.patch

> WebHdfsFileSystem should retry reads according to the configured retry policy.
> --
>
> Key: HDFS-7163
> URL: https://issues.apache.org/jira/browse/HDFS-7163
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0, 2.5.1
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: HDFS-7163.001.patch
>
>
> In the current implementation of WebHdfsFileSystem, opens are retried 
> according to the configured retry policy, but not reads. Therefore, if a 
> connection goes down while data is being read, the read will fail and the 
> read will have to be retried by the client code.
> Also, after the connection has been re-established, the next read (or 
> seek/read) will fail and the read will have to be restarted by the client code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9311) Support optional offload of NameNode HA service health checks to a separate RPC server.

2015-10-27 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9311:

Attachment: HDFS-9311.002.patch

[~jnp], thank you for the review.  I agree.  The handler ratio was a holdover 
from splitting away from the HDFS-9239 patch.  For the scope of this issue, we 
don't need that many threads.

I'm uploading patch v002 with these changes:
# Adjusted the threads as per Jitendra's feedback.
# Corrected some mocking that was causing failures in 
{{TestZKFailoverController}} and {{TestZKFailoverControllerStress}} in the last 
Jenkins run.
# Corrected Checkstyle and whitespace warnings.

> Support optional offload of NameNode HA service health checks to a separate 
> RPC server.
> ---
>
> Key: HDFS-9311
> URL: https://issues.apache.org/jira/browse/HDFS-9311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9311.001.patch, HDFS-9311.002.patch
>
>
> When a NameNode is overwhelmed with load, it can lead to resource exhaustion 
> of the RPC handler pools (both client-facing and service-facing).  
> Eventually, this blocks the health check RPC issued from ZKFC, which triggers 
> a failover.  Depending on fencing configuration, the former active NameNode 
> may be killed.  In an overloaded situation, the new active NameNode is likely 
> to suffer the same fate, because client load patterns don't change after the 
> failover.  This can degenerate into flapping between the 2 NameNodes without 
> real recovery.  If a NameNode had been killed by fencing, then it would have 
> to transition through safe mode, further delaying time to recovery.
> This issue proposes a separate, optional RPC server at the NameNode for 
> isolating the HA health checks.  These health checks are lightweight 
> operations that do not suffer from contention issues on the namesystem lock 
> or other shared resources.  Isolating the RPC handlers is sufficient to avoid 
> this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9259) Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977037#comment-14977037
 ] 

Hudson commented on HDFS-9259:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #591 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/591/])
HDFS-9259. Make SO_SNDBUF size configurable at DFSClient side for hdfs (mingma: 
rev aa09880ab85f3c35c12373976e7b03f3140b65c8)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientSocketSize.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


> Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
> --
>
> Key: HDFS-9259
> URL: https://issues.apache.org/jira/browse/HDFS-9259
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9259.000.patch, HDFS-9259.001.patch
>
>
> We recently found that cross-DC hdfs writes could be really slow. Further 
> investigation identified that this is due to the SendBufferSize and 
> ReceiveBufferSize used for the hdfs write. The test ran "hadoop fs 
> -copyFromLocal" on a 256MB file across DCs with different SendBufferSize and 
> ReceiveBufferSize values. The results showed that c is much faster than b, 
> and b is faster than a.
> a. SendBufferSize=128k, ReceiveBufferSize=128k (hdfs default setting).
> b. SendBufferSize=128k, ReceiveBufferSize=not set (TCP auto tuning).
> c. SendBufferSize=not set, ReceiveBufferSize=not set (TCP auto tuning for both).
> HDFS-8829 has enabled scenario b. We would like to enable scenario c by 
> making SendBufferSize configurable at the DFSClient side. Cc: [~cmccabe] 
> [~He Tianyi] [~kanaka] [~vinayrpet].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9164) hdfs-nfs connector fails on O_TRUNC

2015-10-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977067#comment-14977067
 ] 

Mingliang Liu commented on HDFS-9164:
-

You may need to click "Submit Patch" to trigger a Jenkins build.

> hdfs-nfs connector fails on O_TRUNC
> ---
>
> Key: HDFS-9164
> URL: https://issues.apache.org/jira/browse/HDFS-9164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Constantine Peresypkin
> Attachments: HDFS-9164.1.patch
>
>
> The Linux NFS client will issue `open(.. O_TRUNC); write()` when overwriting 
> a file that's in the NFS client cache (probably to avoid evicting the inode), 
> which will fail spectacularly on hdfs-nfs with an I/O error.
> Example:
> $ cp /some/file /to/hdfs/mount/
> $ cp /some/file /to/hdfs/mount/
> I/O error
> The first write will pass if the file is not in the cache; the second one 
> will always fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9255) Consolidate block recovery related implementation into a single class

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977111#comment-14977111
 ] 

Zhe Zhang commented on HDFS-9255:
-

Thanks Walter for the update. +1 on the latest patch (listing a few nits that 
can be done as follow-on). [~jingzhao] Do you have further comments?
# {{BlockRecoveryWorker#recoverBlocks}} doesn't need a return value
# Even though DN now initializes the recovery worker before the block pool 
manager, {{BPOfferService}} can still use a null check at 
{{dn.getBlockRecoveryWorker()}} (see the sketch below)
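
For the second nit, a minimal sketch of the null check I have in mind (the 
surrounding names approximate the patch; they are not its exact code):

{code}
// Sketch: BPOfferService defends against an uninitialized recovery worker.
private void checkBlockRecoveryWorker(DataNode dn, String who,
    Collection<RecoveringBlock> blocks) throws IOException {
  BlockRecoveryWorker worker = dn.getBlockRecoveryWorker();
  if (worker != null) {  // worker may not be initialized yet, or already shut down
    worker.recoverBlocks(who, blocks);
  }
}
{code}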

> Consolidate block recovery related implementation into a single class
> -
>
> Key: HDFS-9255
> URL: https://issues.apache.org/jira/browse/HDFS-9255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Attachments: HDFS-9255.01.patch, HDFS-9255.02.patch, 
> HDFS-9255.03.patch, HDFS-9255.04.patch, HDFS-9255.05.patch, HDFS-9255.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9289) check genStamp when complete file

2015-10-27 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-9289:
---
Attachment: HDFS-9289.3.patch

The .3 patch sets {{block}} in DataStreamer to volatile.

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch
>
>
> We have seen a case of a corrupt block which is caused by completing the 
> file after a pipelineUpdate, where the file completes with the old block 
> genStamp. This caused the replicas of two datanodes in the updated pipeline 
> to be viewed as corrupt. Propose to check the genStamp when committing the 
> block.
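
For context, the proposed check would look roughly like the following (a 
sketch of the idea, not the attached patch; placement and message are 
illustrative):

{code}
// Sketch: reject a commit whose reported genStamp does not match the stored
// block, instead of letting good replicas be marked corrupt later.
void commitBlock(BlockInfo stored, Block reported) throws IOException {
  if (stored.getGenerationStamp() != reported.getGenerationStamp()) {
    // The client is completing the file with a stale genStamp, e.g. after a
    // pipeline update bumped it on the datanodes.
    throw new IOException("Commit block with mismatching GS. Stored: "
        + stored.getGenerationStamp() + ", reported: "
        + reported.getGenerationStamp());
  }
  // ... proceed with the normal commit path ...
}
{code}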



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976995#comment-14976995
 ] 

Zhe Zhang commented on HDFS-9229:
-

Great work Surendra! The patch looks good overall. A few comments:

# We should add Javadoc for {{nameDirSizeMap}}
# The behavior of shared directories is worth more discussion. The current 
patch returns {{0}} if the directory is shared. Since the purpose of this new 
metric is local storage planning / provisioning, shall we report the size of 
shared dirs as well?

Nits:
# Instead of directly using the {{isShared}} variable, we should call 
{{isShared()}}
# The temporary {{nnDirSizeMap}} is not necessary. I think we can directly 
clear and add to {{nameDirSizeMap}} (see the sketch below).
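
To make the second nit concrete, a rough sketch (not the attached patch; the 
map type and traversal are assumptions):

{code}
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NameDirSizeSketch {
  /** Cached sizes, keyed by storage directory path. */
  private final Map<String, Long> nameDirSizeMap = new ConcurrentHashMap<>();

  /** Clear and repopulate the map directly, without a temporary map. */
  void updateNameDirSize(Iterable<File> nameDirs) {
    nameDirSizeMap.clear();
    for (File dir : nameDirs) {
      nameDirSizeMap.put(dir.getAbsolutePath(), sizeOf(dir));
    }
  }

  private static long sizeOf(File f) {
    if (f.isFile()) {
      return f.length();
    }
    long total = 0;
    File[] children = f.listFiles();
    if (children != null) {
      for (File c : children) {
        total += sizeOf(c);  // recurse into subdirectories
      }
    }
    return total;
  }
}
{code}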

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-9229.001.patch, HDFS-9229.002.patch, 
> HDFS-9229.003.patch, HDFS-9229.004.patch, HDFS-9229.005.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9318) considerLoad factor can be improved

2015-10-27 Thread Kuhu Shukla (JIRA)
Kuhu Shukla created HDFS-9318:
-

 Summary: considerLoad factor can be improved
 Key: HDFS-9318
 URL: https://issues.apache.org/jira/browse/HDFS-9318
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Kuhu Shukla
Assignee: Kuhu Shukla


Currently considerLoad avoids choosing nodes that are too active, so it helps 
level the HDFS load across the cluster. Under normal conditions, this is 
desired. However, when a cluster has a large percentage of nearly full nodes, 
this can make it difficult to find good targets because the placement policy 
wants to avoid the full nodes, but considerLoad wants to avoid the busy 
less-full nodes.
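
For reference, the check in question looks roughly like this (a paraphrase of 
{{BlockPlacementPolicyDefault}}; the 2x factor is from memory and worth 
double-checking):

{code}
// Paraphrase of considerLoad: reject a candidate whose active xceiver count
// exceeds twice the cluster-wide average.
boolean isGoodTarget(DatanodeDescriptor node, double avgXceiverLoad,
    boolean considerLoad) {
  if (considerLoad) {
    final double maxLoad = 2.0 * avgXceiverLoad;
    if (node.getXceiverCount() > maxLoad) {
      // On a mostly-full cluster, the busy less-full nodes rejected here may
      // be the only nodes the placement policy would otherwise accept.
      return false;
    }
  }
  return true;  // remaining checks (space, rack, etc.) omitted in this sketch
}
{code}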



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9083) Replication violates block placement policy.

2015-10-27 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977073#comment-14977073
 ] 

Jing Zhao commented on HDFS-9083:
-

The patch looks good to me. [~mingma] and [~brahmareddy], do you also want to 
take a look at the patch, since you worked on HDFS-8647?

> Replication violates block placement policy.
> 
>
> Key: HDFS-9083
> URL: https://issues.apache.org/jira/browse/HDFS-9083
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS, namenode
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Blocker
> Attachments: HDFS-9083-branch-2.7.patch
>
>
> Recently we have been noticing many cases in which all the replicas of a 
> block are residing on the same rack.
> During block creation, the block placement policy was honored, but after a 
> node failure event of a specific kind, the block ends up in such a state.
> On investigating further, I found that BlockManager#blockHasEnoughRacks 
> depends on the config (net.topology.script.file.name)
> {noformat}
> if (!this.shouldCheckForEnoughRacks) {
>   return true;
> }
> {noformat}
> We specify a DNSToSwitchMapping implementation (our own custom 
> implementation) via net.topology.node.switch.mapping.impl and no longer use 
> the net.topology.script.file.name config.
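
For clarity, the initialization that causes this is roughly the following 
(paraphrased from the {{BlockManager}} constructor; the exact code may differ 
slightly):

{code}
// Paraphrase: the rack check is enabled only when the script-based mapping
// is configured, ignoring custom DNSToSwitchMapping implementations.
this.shouldCheckForEnoughRacks =
    conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
// With net.topology.node.switch.mapping.impl set to a custom mapping and the
// script key unset, this stays false and blockHasEnoughRacks() is a no-op.
{code}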



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9312) Fix TestReplication to be FsDataset-agnostic.

2015-10-27 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9312:

Affects Version/s: 2.7.1
 Target Version/s: 3.0.0, 2.8.0
   Status: Patch Available  (was: In Progress)

> Fix TestReplication to be FsDataset-agnostic.
> -
>
> Key: HDFS-9312
> URL: https://issues.apache.org/jira/browse/HDFS-9312
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9312.00.patch
>
>
> {{TestReplication}} uses raw file system access to inject dummy replica 
> files. This makes {{TestReplication}} incompatible with non-FsDataset 
> implementations.
> We can fix it by using the existing {{FsDatasetTestUtils}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9267) TestDiskError should get stored replicas through FsDatasetTestUtils.

2015-10-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976993#comment-14976993
 ] 

Lei (Eddy) Xu commented on HDFS-9267:
-

[~cmccabe]  thanks a lot for the suggestions.

bq. Do you think it would be better to have an Iterator here? 

In {{getStoredReplicas()}}, it needs to scan and load all on-disk replicas into 
a local {{replicaMap}} to verify the contents on the disk. Returning an 
iterator over this local {{replicaMap}} has the same space complexity as 
returning a Collection, because the {{replicaMap}} is still referred to by the 
iterator. Also, it is less readable to implement {{isEmpty()}} using an 
iterator (i.e., using {{it.hasNext()}}) in the following code:

{code}
while (!utils.getStoredReplicas(bpid).isEmpty()) {
  Thread.sleep(100);
}
{code}

bq. That collection of all the replicas in the dataset could get pretty big in 
theory.

{{FsDatasetTestUtils}} is only used by {{HDFS}} unit tests, which should not 
have millions of blocks in one test. Will {{HBase}} or other projects use this 
function? If the space is a concern, we could write a replica Scanner in the 
future.

What do you think?

> TestDiskError should get stored replicas through FsDatasetTestUtils.
> 
>
> Key: HDFS-9267
> URL: https://issues.apache.org/jira/browse/HDFS-9267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9267.00.patch, HDFS-9267.01.patch, 
> HDFS-9267.02.patch
>
>
> {{TestDiskError#testReplicationError}} scans local directories to verify 
> blocks and metadata files, which leaks the details of the {{FsDataset}} 
> implementation. 
> This JIRA will abstract the "scanning" operation to {{FsDatasetTestUtils}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9164) hdfs-nfs connector fails on O_TRUNC

2015-10-27 Thread Constantine Peresypkin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constantine Peresypkin updated HDFS-9164:
-
Assignee: Constantine Peresypkin
  Status: Patch Available  (was: Open)

> hdfs-nfs connector fails on O_TRUNC
> ---
>
> Key: HDFS-9164
> URL: https://issues.apache.org/jira/browse/HDFS-9164
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Constantine Peresypkin
>Assignee: Constantine Peresypkin
> Attachments: HDFS-9164.1.patch
>
>
> Linux NFS client will issue `open(.. O_TRUNC); write()` when overwriting a 
> file that's in the nfs client cache (probably to avoid evicting the inode), 
> which will spectacularly fail on hdfs-nfs with an I/O error.
> Example:
> $ cp /some/file /to/hdfs/mount/
> $ cp /some/file /to/hdfs/mount/
> I/O error
> The first write will pass if the file is not in cache, the second one will 
> always fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9311) Support optional offload of NameNode HA service health checks to a separate RPC server.

2015-10-27 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9311:

Attachment: HDFS-9311.003.patch

I'm attaching patch v003.  The only difference compared to v002 is the log 
message in {{NameNode#setRpcLifelineServerAddress}}.  The message text had been 
talking about DataNodes.  It was another holdover from splitting this out from 
the HDFS-9239 work.  I generalized the text to "Setting lifeline RPC address".

> Support optional offload of NameNode HA service health checks to a separate 
> RPC server.
> ---
>
> Key: HDFS-9311
> URL: https://issues.apache.org/jira/browse/HDFS-9311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9311.001.patch, HDFS-9311.002.patch, 
> HDFS-9311.003.patch
>
>
> When a NameNode is overwhelmed with load, it can lead to resource exhaustion 
> of the RPC handler pools (both client-facing and service-facing).  
> Eventually, this blocks the health check RPC issued from ZKFC, which triggers 
> a failover.  Depending on fencing configuration, the former active NameNode 
> may be killed.  In an overloaded situation, the new active NameNode is likely 
> to suffer the same fate, because client load patterns don't change after the 
> failover.  This can degenerate into flapping between the 2 NameNodes without 
> real recovery.  If a NameNode had been killed by fencing, then it would have 
> to transition through safe mode, further delaying time to recovery.
> This issue proposes a separate, optional RPC server at the NameNode for 
> isolating the HA health checks.  These health checks are lightweight 
> operations that do not suffer from contention issues on the namesystem lock 
> or other shared resources.  Isolating the RPC handlers is sufficient to avoid 
> this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9307) fuseConnect should be private to fuse_connect.c

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977311#comment-14977311
 ] 

Hudson commented on HDFS-9307:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2535 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2535/])
HDFS-9307. fuseConnect should be private to fuse_connect.c (Mingliang (cmccabe: 
rev faeb6a3f89f3580a5b1a40c6a1f6205269a5aa7a)
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.h
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> fuseConnect should be private to fuse_connect.c
> ---
>
> Key: HDFS-9307
> URL: https://issues.apache.org/jira/browse/HDFS-9307
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Colin Patrick McCabe
>Assignee: Mingliang Liu
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9307.000.patch, HDFS-9307.001.patch
>
>
> fuseConnect should be private to fuse_connect.c, since it's not used outside 
> that file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977317#comment-14977317
 ] 

Hudson commented on HDFS-9299:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2535 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2535/])
HDFS-9299. Give ReplicationMonitor a readable thread name (Staffan (cmccabe: 
rev fe93577faf49ceb2ee47a7762a61625313ea773b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-9299.001.patch
>
>
> Currently the log output from the Replication Monitor uses the class name; 
> by setting a name on the thread, the output will be easier to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
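
The change amounts to naming the daemon thread, roughly (a sketch, not the 
exact patch):

{code}
// Sketch: give the monitor thread a readable name instead of the default
// anonymous-class toString().
Daemon replicationThread = new Daemon(new ReplicationMonitor());
replicationThread.setName("ReplicationMonitor");  // shows as [ReplicationMonitor] in logs
replicationThread.start();
{code}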



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9311) Support optional offload of NameNode HA service health checks to a separate RPC server.

2015-10-27 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977194#comment-14977194
 ] 

Jitendra Nath Pandey commented on HDFS-9311:


+1

> Support optional offload of NameNode HA service health checks to a separate 
> RPC server.
> ---
>
> Key: HDFS-9311
> URL: https://issues.apache.org/jira/browse/HDFS-9311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-9311.001.patch, HDFS-9311.002.patch, 
> HDFS-9311.003.patch
>
>
> When a NameNode is overwhelmed with load, it can lead to resource exhaustion 
> of the RPC handler pools (both client-facing and service-facing).  
> Eventually, this blocks the health check RPC issued from ZKFC, which triggers 
> a failover.  Depending on fencing configuration, the former active NameNode 
> may be killed.  In an overloaded situation, the new active NameNode is likely 
> to suffer the same fate, because client load patterns don't change after the 
> failover.  This can degenerate into flapping between the 2 NameNodes without 
> real recovery.  If a NameNode had been killed by fencing, then it would have 
> to transition through safe mode, further delaying time to recovery.
> This issue proposes a separate, optional RPC server at the NameNode for 
> isolating the HA health checks.  These health checks are lightweight 
> operations that do not suffer from contention issues on the namesystem lock 
> or other shared resources.  Isolating the RPC handlers is sufficient to avoid 
> this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9312) Fix TestReplication to be FsDataset-agnostic.

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977230#comment-14977230
 ] 

Zhe Zhang commented on HDFS-9312:
-

Thanks for the work, Eddy. The patch LGTM overall.

The only issue I see is that we should have a better name for 
{{injectReplicas}}: 1) it only injects 1 replica; 2) it took me a while to 
realize it injects a _corrupt_ replica. Maybe something like 
{{injectCorruptReplica}}?
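
For example, the injection site could then read roughly like this (the 
accessor and method names follow the suggestion above and are assumptions, not 
the actual {{FsDatasetTestUtils}} API):

{code}
// Hypothetical usage sketch: replace raw file manipulation with the
// dataset-agnostic test utils.
FsDatasetTestUtils utils =
    cluster.getFsDatasetTestUtils(dnIndex);  // assumed accessor
utils.injectCorruptReplica(block);           // assumed helper, per the rename suggestion
{code}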

+1 pending Jenkins and after addressing the above.

> Fix TestReplication to be FsDataset-agnostic.
> -
>
> Key: HDFS-9312
> URL: https://issues.apache.org/jira/browse/HDFS-9312
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-9312.00.patch
>
>
> {{TestReplication}} uses raw file system access to inject dummy replica 
> files. This makes {{TestReplication}} incompatible with non-FsDataset 
> implementations.
> We can fix it by using the existing {{FsDatasetTestUtils}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9279) Decommissioned capacity should not be considered for configured/used capacity

2015-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977236#comment-14977236
 ] 

Hadoop QA commented on HDFS-9279:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  24m 18s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  10m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  13m 43s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 32s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 54s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 59s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 45s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 18s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  82m 24s | Tests failed in hadoop-hdfs. |
| | | 144m  0s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.web.TestWebHdfsUrl |
|   | hadoop.hdfs.web.TestWebHdfsTokens |
|   | hadoop.hdfs.server.datanode.TestBlockReplacement |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12769047/HDFS-9279-v4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5c24fe7 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13230/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13230/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13230/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13230/console |


This message was automatically generated.

> Decommissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch, 
> HDFS-9279-v3.patch, HDFS-9279-v4.patch
>
>
> The capacity of a decommissioned node is being counted in the configured and 
> used capacity metrics. This gives an incorrect perception of cluster usage.
> Once a node is decommissioned, its capacity should be treated like that of a 
> dead node.
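
A sketch of the intended accounting change (the loop and method names 
approximate the heartbeat/stats code, not the attached patch):

{code}
// Sketch: exclude decommissioned nodes from the configured/used capacity
// totals, the same way dead nodes are excluded.
long capacityTotal = 0;
long capacityUsed = 0;
for (DatanodeDescriptor node : datanodes) {
  if (!node.isAlive() || node.isDecommissioned()) {
    continue;  // dead or decommissioned: contributes nothing to capacity
  }
  capacityTotal += node.getCapacity();
  capacityUsed += node.getDfsUsed();
}
{code}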



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

