date:20151030

[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-10-30 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982092#comment-14982092
 ] 

Rakesh R commented on HDFS-8777:


Thanks a lot [~zhz] for the useful comments.
- 1, 2, 3 >> Agreed and will update it when creating another patch.

- 4 >> yeah, I think it is possible to add strict validation by checking 
{{iip.getLastINode() == null}} and throw exception. Probably will keep the 
condition [here 
FSDirErasureCodingOp.java#L229|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirErasureCodingOp.java#L229].
 How about raise a minor sub-task under HDFS-8031 and discuss there separately?
- 5 >> To test erasure coding policy + snapshot behavior I think we need to add 
extra code and do assertions in between. IMHO would be good to keep ec policy 
tests separately. Does it make sense to you?

> Erasure Coding: add tests for taking snapshots on EC files
> --
>
> Key: HDFS-8777
> URL: https://issues.apache.org/jira/browse/HDFS-8777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Jing Zhao
>Assignee: Rakesh R
>  Labels: test
> Attachments: HDFS-8777-01.patch, HDFS-8777-02.patch, 
> HDFS-8777-HDFS-7285-00.patch, HDFS-8777-HDFS-7285-01.patch
>
>
> We need to add more tests for (EC + snapshots). The tests need to verify the 
> fsimage saving/loading is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982187#comment-14982187
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2547 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2547/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed

2015-10-30 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen reassigned HDFS-9344:
---

Assignee: Xiao Chen  (was: Jagadesh Kiran N)

> QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
> -
>
> Key: HDFS-9344
> URL: https://issues.apache.org/jira/browse/HDFS-9344
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Xiao Chen
>Priority: Minor
>
> After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in 
> {{UnderReplicatedBlocks}} is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981988#comment-14981988
 ] 

Hadoop QA commented on HDFS-9323:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 21 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 31s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 49s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 125m 52s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
| JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | hadoop.hdfs.TestWriteReadStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-10-30 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12769212/h9323_20151028.patch |
| JIRA Issue | HDFS-9323 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  
checkstyle  compile  |
| uname | Linux 849c09713875 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build

[jira] [Commented] (HDFS-9275) Wait previous ErasureCodingWork to finish before schedule another one

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982022#comment-14982022
 ] 

Hadoop QA commented on HDFS-9275:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 4s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 161, now 161). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 57s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 29s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| JDK v1.7.0_79 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   |

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982020#comment-14982020
 ] 

Walter Su commented on HDFS-9236:
-

If a buggy DN does return RUR without throwing {{RecoveryInProgressException}}, 
please put the checking in the {{recover()}} after 
{{callInitReplicaRecovery()}}.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Issue Comment Deleted] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-9323:
--
Comment: was deleted

(was: Filed HDFS-9346 testMultipleDatanodeFailure56.)

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982078#comment-14982078
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9323:
---

Filed HDFS-9346 testMultipleDatanodeFailure56.

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed

2015-10-30 Thread Zhe Zhang (JIRA)

Zhe Zhang created HDFS-9344:
---

 Summary: QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
 Key: HDFS-9344
 URL: https://issues.apache.org/jira/browse/HDFS-9344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Priority: Minor


After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in 
{{UnderReplicatedBlocks}} is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-9346) TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56 may throw NPE

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)

Tsz Wo Nicholas Sze created HDFS-9346:
-

 Summary: 
TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56 may throw 
NPE
 Key: HDFS-9346
 URL: https://issues.apache.org/jira/browse/HDFS-9346
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo Nicholas Sze
Priority: Minor


See the NPE in [build 
13294#|https://builds.apache.org/job/PreCommit-HDFS-Build/13294/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testMultipleDatanodeFailure56/].
  It seems a bug in the test.
{code}
java.lang.NullPointerException: null
at 
org.apache.hadoop.hdfs.MiniDFSCluster.stopDataNode(MiniDFSCluster.java:2157)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.killDatanode(TestDFSStripedOutputStreamWithFailure.java:445)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:374)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:301)
at 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:172)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9343) Empty caller context considered invalid

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982153#comment-14982153
 ] 

Hadoop QA commented on HDFS-9343:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 13s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 16s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 13s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 57s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 5s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 28s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 19s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 189m 11s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.7.0_79 Failed junit tests | hadoop.metrics2.impl.TestGangliaMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982006#comment-14982006
 ] 

Zhe Zhang commented on HDFS-9236:
-

bq. If a DN has a RUR, it will return RecoveryInProgressException.
[~walter.k.su] A quick comment that DN could run an older version of HDFS than 
NN. And unknown DN bugs could violate the above assumption as well. Similar to 
what we saw on HDFS-9289.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982036#comment-14982036
 ] 

Hadoop QA commented on HDFS-9276:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 5s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 13s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 5s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 57s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 1s 
{color} | {color:red} Patch generated 1 new checkstyle issues in root (total 
was 29, now 30). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 58s 
{color} | {color:red} hadoop-common-project/hadoop-common introduced 1 new 
FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 57s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 35s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 5s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 20s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 29s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 167m 22s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
|  |  org.apache.hadoop.security.token.Token$PrivateToken defines equals but 
not hashCode  At Token.java:hashCode  At Token.java:[lines 216-222] |
| JDK v1.8.0_60 Failed junit tests | hadoop.ha.TestZKFailoverController |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   |

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982059#comment-14982059
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9323:
---

I see what you mean now.  Thanks for reviewing the patch.

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982065#comment-14982065
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9323:
---

The failure of testMultipleDatanodeFailure56 is not related.  The other failed 
tests are obviously not related.

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982162#comment-14982162
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #605 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/605/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock

2015-10-30 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9289:

Summary: Make DataStreamer#block thread safe and verify genStamp in 
commitBlock  (was: check genStamp when complete file)

> Make DataStreamer#block thread safe and verify genStamp in commitBlock
> --
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch, 
> HDFS-9289.4.patch, HDFS-9289.5.patch, HDFS-9289.6.patch
>
>
> we have seen a case of corrupt block which is caused by file complete after a 
> pipelineUpdate, but the file complete with the old block genStamp. This 
> caused the replicas of two datanodes in updated pipeline to be viewed as 
> corrupte. Propose to check genstamp when commit block



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed

2015-10-30 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981969#comment-14981969
 ] 

Xiao Chen commented on HDFS-9344:
-

Thanks for creating the JIRA Zhe. I'll be working on it. 

> QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
> -
>
> Key: HDFS-9344
> URL: https://issues.apache.org/jira/browse/HDFS-9344
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Xiao Chen
>Priority: Minor
>
> After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in 
> {{UnderReplicatedBlocks}} is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed

2015-10-30 Thread Jagadesh Kiran N (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N reassigned HDFS-9344:
--

Assignee: Jagadesh Kiran N

> QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
> -
>
> Key: HDFS-9344
> URL: https://issues.apache.org/jira/browse/HDFS-9344
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Jagadesh Kiran N
>Priority: Minor
>
> After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in 
> {{UnderReplicatedBlocks}} is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-9345) Erasure Coding: create dummy coder and schema

2015-10-30 Thread Rui Li (JIRA)

Rui Li created HDFS-9345:


 Summary: Erasure Coding: create dummy coder and schema
 Key: HDFS-9345
 URL: https://issues.apache.org/jira/browse/HDFS-9345
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rui Li
Assignee: Rui Li


We  can create dummy coder which does no computation and simply returns zero 
bytes. Similarly, we can create a test-only schema with no parity blocks.
Such coder and schema can be used to isolate the performance issue to HDFS-side 
logic instead of codec, which would be useful when tuning performance of EC.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-30 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981986#comment-14981986
 ] 

Surendra Singh Lilhore commented on HDFS-9229:
--

Thanks [~zhz] for reviews and commit.

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-9229.001.patch, HDFS-9229.002.patch, 
> HDFS-9229.003.patch, HDFS-9229.004.patch, HDFS-9229.005.patch, 
> HDFS-9229.006.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock

2015-10-30 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981971#comment-14981971
 ] 

Zhe Zhang commented on HDFS-9289:
-

Thanks Chang for the update. The main code change looks good. I'll take a 
closer look at the test tomorrow. I just updated the JIRA title. Please see if 
it looks OK.

> Make DataStreamer#block thread safe and verify genStamp in commitBlock
> --
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch, 
> HDFS-9289.4.patch, HDFS-9289.5.patch, HDFS-9289.6.patch
>
>
> we have seen a case of corrupt block which is caused by file complete after a 
> pipelineUpdate, but the file complete with the old block genStamp. This 
> caused the replicas of two datanodes in updated pipeline to be viewed as 
> corrupte. Propose to check genstamp when commit block



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-30 Thread Liangliang Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangliang Gu updated HDFS-9276:

Attachment: HDFS-9276.09.patch

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applicatons. 
> HDFS Client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause token expired.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration to Name Node:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occure after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
>   at 
>

[jira] [Commented] (HDFS-9342) Erasure coding: client should update and commit block based on acknowledged size

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982192#comment-14982192
 ] 

Walter Su commented on HDFS-9342:
-

Both {{updatePipeline}} and {{commitBlock}} will set the visible length of 
lastBlock.
It's ok {{commitBlock}} uses {{currentBlockGroup}}, becasue we have flushed all.
{{updatePipeline}} set too bigger visible length according to 
{{currentBlockGroup}}, of which the last part can be invisible, since it's not 
acked yet.
Thanks [~zhz] for reporting this. I think it is a issue.

> Erasure coding: client should update and commit block based on acknowledged 
> size
> 
>
> Key: HDFS-9342
> URL: https://issues.apache.org/jira/browse/HDFS-9342
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Priority: Critical
>
> For non-EC files, we have:
> {code}
> protected ExtendedBlock block; // its length is number of bytes acked
> {code}
> For EC files, the size of {{DFSStripedOutputStream#currentBlockGroup}} is 
> incremented in {{writeChunk}} without waiting for ack. And both 
> {{updatePipeline}} and {{commitBlock}} are based on size of 
> {{currentBlockGroup}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9343) Empty caller context considered invalid

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982004#comment-14982004
 ] 

Hadoop QA commented on HDFS-9343:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 0s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 59s 
{color} | {color:red} Patch generated 1 new checkstyle issues in root (total 
was 263, now 263). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 12s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 48s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 26s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 45s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 198m 16s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
| JDK v1.7.0_79 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.0 Server=1.7.0 
Image:test-patch-base-hadoop-date2015-10-30 |
|

[jira] [Updated] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-9323:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I have committed this.

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982116#comment-14982116
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8728 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8728/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982079#comment-14982079
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9323:
---

Filed HDFS-9346 testMultipleDatanodeFailure56.

> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9348) DFS GetErasureCodingPolicy API on a non-existent file should be handled properly

2015-10-30 Thread Rakesh R (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9348:
---
Attachment: HDFS-9348-00.patch

> DFS GetErasureCodingPolicy API on a non-existent file should be handled 
> properly
> 
>
> Key: HDFS-9348
> URL: https://issues.apache.org/jira/browse/HDFS-9348
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9348-00.patch
>
>
> Presently calling {{dfs#getErasureCodingPolicy()}} on a non-existent file is 
> returning the ErasureCodingPolicy info. As per the 
> [discussion|https://issues.apache.org/jira/browse/HDFS-8777?focusedCommentId=14981077=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14981077]
>  it has to validate and throw FileNotFoundException.
> Also, {{dfs#getEncryptionZoneForPath()}} API has the same behavior. Again we 
> can discuss to add the file existence validation in this case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9348) DFS GetErasureCodingPolicy API on a non-existent file should be handled properly

2015-10-30 Thread Rakesh R (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9348:
---
Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

> DFS GetErasureCodingPolicy API on a non-existent file should be handled 
> properly
> 
>
> Key: HDFS-9348
> URL: https://issues.apache.org/jira/browse/HDFS-9348
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9348-00.patch
>
>
> Presently calling {{dfs#getErasureCodingPolicy()}} on a non-existent file is 
> returning the ErasureCodingPolicy info. As per the 
> [discussion|https://issues.apache.org/jira/browse/HDFS-8777?focusedCommentId=14981077=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14981077]
>  it has to validate and throw FileNotFoundException.
> Also, {{dfs#getEncryptionZoneForPath()}} API has the same behavior. Again we 
> can discuss to add the file existence validation in this case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9254:

Attachment: HDFS-9254.03.patch

v3 patch.
# I had fixed incorrect defaults in hdfs-defaults.xml which broke some secure 
mode tests. Fixed those tests to initialize their configuration correctly.
# Remove mention of servicen...@realm.tld principals.

> HDFS Secure Mode Documentation updates
> --
>
> Key: HDFS-9254
> URL: https://issues.apache.org/jira/browse/HDFS-9254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-9254.01.patch, HDFS-9254.02.patch, 
> HDFS-9254.03.patch
>
>
> Some Kerberos configuration parameters are not documented well enough. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9329) TestBootstrapStandby#testRateThrottling is flaky because fsimage size is smaller than IO buffer size

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983476#comment-14983476
 ] 

Hadoop QA commented on HDFS-9329:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 3s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 13s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 143m 14s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
|   | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup |
|   |

[jira] [Created] (HDFS-9349) Reconfigure NN protected directories on the fly

2015-10-30 Thread Xiaobing Zhou (JIRA)

Xiaobing Zhou created HDFS-9349:
---

 Summary: Reconfigure NN protected directories on the fly
 Key: HDFS-9349
 URL: https://issues.apache.org/jira/browse/HDFS-9349
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9343) Empty caller context considered invalid

2015-10-30 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9343:

Status: Open  (was: Patch Available)

> Empty caller context considered invalid
> ---
>
> Key: HDFS-9343
> URL: https://issues.apache.org/jira/browse/HDFS-9343
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9343.000.patch, HDFS-9343.001.patch, 
> HDFS-9343.002.patch, HDFS-9343.003.patch
>
>
> The caller context with empty context string is considered invalid, and it 
> should not appear in the audit log.
> Meanwhile, too long signature will not be written to audit log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9343) Empty caller context considered invalid

2015-10-30 Thread Mingliang Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9343:

Status: Patch Available  (was: Open)

> Empty caller context considered invalid
> ---
>
> Key: HDFS-9343
> URL: https://issues.apache.org/jira/browse/HDFS-9343
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9343.000.patch, HDFS-9343.001.patch, 
> HDFS-9343.002.patch, HDFS-9343.003.patch
>
>
> The caller context with empty context string is considered invalid, and it 
> should not appear in the audit log.
> Meanwhile, too long signature will not be written to audit log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9262) Reconfigure DN lazy writer interval on the fly

2015-10-30 Thread Xiaobing Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9262:

Summary: Reconfigure DN lazy writer interval on the fly  (was: Reconfigure 
lazy writer interval on the fly)

> Reconfigure DN lazy writer interval on the fly
> --
>
> Key: HDFS-9262
> URL: https://issues.apache.org/jira/browse/HDFS-9262
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9262.001.patch
>
>
> This is to reconfigure
> {code}
> dfs.datanode.lazywriter.interval.sec
> {code}
> without restarting DN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9254:

Attachment: HDFS-9254.02.patch

> HDFS Secure Mode Documentation updates
> --
>
> Key: HDFS-9254
> URL: https://issues.apache.org/jira/browse/HDFS-9254
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-9254.01.patch, HDFS-9254.02.patch
>
>
> Some Kerberos configuration parameters are not documented well enough. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9330) Reconfigure DN deleting duplicate replica on the fly

2015-10-30 Thread Xiaobing Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9330:

Summary: Reconfigure DN deleting duplicate replica on the fly  (was: 
Reconfigure deleting duplicate replica on the fly)

> Reconfigure DN deleting duplicate replica on the fly
> 
>
> Key: HDFS-9330
> URL: https://issues.apache.org/jira/browse/HDFS-9330
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9330.001.patch
>
>
> This is to reconfigure
> {code}
> dfs.datanode.duplicate.replica.deletion
> {code}
> without restarting DN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-9349) Reconfigure NN protected directories on the fly

2015-10-30 Thread Xiaobing Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou reassigned HDFS-9349:
---

Assignee: Xiaobing Zhou

> Reconfigure NN protected directories on the fly
> ---
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9308) Add truncateMeta() and deleteMeta() to MiniDFSCluster

2015-10-30 Thread Lei (Eddy) Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983478#comment-14983478
 ] 

Lei (Eddy) Xu commented on HDFS-9308:
-

Hi, [~twu]

Thanks for working on this.  It looks good overall. Just a few small comments:

{code}
   * @return true if the metadata was corrupted, false otherwise.
*/
public void deleteMeta(int i, ExtendedBlock blk)
{code}

Please revise the comments to reflect code change.

{code}
System.out.println("Deliberately removing meta for block " + eb);
{code}

We can use {{LOG.info}} here.

{code}
assertTrue("File length is " + newFileLen + " expecting " +
  expectedNewFileLen, newFileLen == expectedNewFileLen);
{code}

Lets use {{assertEqual}} here and make the second line indent correctly.

+1 once addressed above comments.

> Add truncateMeta() and deleteMeta() to MiniDFSCluster
> -
>
> Key: HDFS-9308
> URL: https://issues.apache.org/jira/browse/HDFS-9308
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Minor
> Attachments: HDFS-9308.001.patch, HDFS-9308.002.patch
>
>
> HDFS-9188 introduced {{corruptMeta()}} method to make corrupting the metadata 
> file filesystem agnostic. There should also be a {{truncateMeta()}} and 
> {{deleteMeta()}} method in MiniDFSCluster to allow truncation of metadata 
> files on DataNodes without writing code that's specific to underling file 
> system. {{FsDatasetTestUtils#truncateMeta()}} is already implemented by 
> HDFS-9188 and cam be exposed easily in {{MiniDFSCluster}}.
> This will be useful for tests such as 
> {{TestLeaseRecovery#testBlockRecoveryWithLessMetafile}} and 
> {{TestCrcCorruption#testCrcCorruption}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed

2015-10-30 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang resolved HDFS-9344.
-
Resolution: Invalid

Thanks Xiao! Good analysis! Closing this as invalid.

> QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
> -
>
> Key: HDFS-9344
> URL: https://issues.apache.org/jira/browse/HDFS-9344
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Xiao Chen
>Priority: Minor
>
> After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in 
> {{UnderReplicatedBlocks}} is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9341) Simulated slow disk in SimulatedFSDataset

2015-10-30 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9341:

Status: Patch Available  (was: Open)

Maybe we can dedicate this JIRA to adding random slowness. Attaching patch with 
that scope.

> Simulated slow disk in SimulatedFSDataset
> -
>
> Key: HDFS-9341
> URL: https://issues.apache.org/jira/browse/HDFS-9341
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Attachments: HDFS-9341.00.patch
>
>
> Besides simulating the byte content, {{SimulatedFSDataset}} can also simulate 
> the scenario where disk is slow when accessing certain bytes. The slowness 
> can be random or controlled at certain bytes. It can also be made 
> configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9350:
--
Status: Patch Available  (was: Open)

> Avoid creating temprorary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Minor
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use StringBuilders directly to avoid creating temporary 
> strings of Long and Block name when doing toString on a Block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9350:
--
Attachment: HDFS-9350.001.patch

> Avoid creating temprorary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Priority: Minor
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use StringBuilders directly to avoid creating temporary 
> strings of Long and Block name when doing toString on a Block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg reassigned HDFS-9350:
-

Assignee: Staffan Friberg

> Avoid creating temprorary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Minor
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use StringBuilders directly to avoid creating temporary 
> strings of Long and Block name when doing toString on a Block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9103) Retry reads on DN failure

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983470#comment-14983470
 ] 

Hadoop QA commented on HDFS-9103:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
52s {color} | {color:green} HDFS-8707 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK 
v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 12s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 12s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 13s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 13s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 13s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 13s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 25s 
{color} | {color:red} Patch generated 425 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 37s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-10-30 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12769851/HDFS-9103.HDFS-8707.006.patch
 |
| JIRA Issue | HDFS-9103 |
| Optional Tests |  asflicense  cc  unit  javac  compile  |
| uname | Linux cffe648ed7e7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e77b1ce/precommit/personality/hadoop.sh
 |
| git revision | HDFS-8707 / 85232c6 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_60 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13305/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13305/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_79.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13305/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt
 |
| cc |

[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983528#comment-14983528
 ] 

Hadoop QA commented on HDFS-8777:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 3m 19s 
{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 31s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 41s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 142m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   |

[jira] [Updated] (HDFS-9341) Simulated slow disk in SimulatedFSDataset

2015-10-30 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9341:

Attachment: HDFS-9341.00.patch

> Simulated slow disk in SimulatedFSDataset
> -
>
> Key: HDFS-9341
> URL: https://issues.apache.org/jira/browse/HDFS-9341
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>Priority: Minor
> Attachments: HDFS-9341.00.patch
>
>
> Besides simulating the byte content, {{SimulatedFSDataset}} can also simulate 
> the scenario where disk is slow when accessing certain bytes. The slowness 
> can be random or controlled at certain bytes. It can also be made 
> configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9305) Delayed heartbeat processing causes storm of subsequent heartbeats

2015-10-30 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983629#comment-14983629
 ] 

Arpit Agarwal commented on HDFS-9305:
-

That is right, thanks for updating it.

> Delayed heartbeat processing causes storm of subsequent heartbeats
> --
>
> Key: HDFS-9305
> URL: https://issues.apache.org/jira/browse/HDFS-9305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Arpit Agarwal
> Fix For: 2.7.2
>
> Attachments: HDFS-9305.01.patch, HDFS-9305.02.patch
>
>
> A DataNode typically sends a heartbeat to the NameNode every 3 seconds.  We 
> expect heartbeat handling to complete relatively quickly.  However, if 
> something unexpected causes heartbeat processing to get blocked, such as a 
> long GC or heavy lock contention within the NameNode, then heartbeat 
> processing would be delayed.  After recovering from this delay, the DataNode 
> then starts sending a storm of heartbeat messages in a tight loop.  In a 
> large cluster with many DataNodes, this storm of heartbeat messages could 
> cause harmful load on the NameNode and make overall cluster recovery more 
> difficult.
> The bug appears to be caused by incorrect timekeeping inside 
> {{BPServiceActor}}.  The next heartbeat time is always calculated as a delta 
> from the previous heartbeat time, without any compensation for possible long 
> latency on an individual heartbeat RPC.  The only mitigation would be 
> restarting all DataNodes to force a reset of the heartbeat schedule, or 
> simply wait out the storm until the scheduling catches up and corrects itself.
> This problem would not manifest after a NameNode restart.  In that case, the 
> NameNode would respond to the first heartbeat by telling the DataNode to 
> re-register, and {{BPServiceActor#reRegister}} would reset the heartbeat 
> schedule to the current time.  I believe the problem would only manifest if 
> the NameNode process kept alive, but processed heartbeats unexpectedly slowly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9305) Delayed heartbeat processing causes storm of subsequent heartbeats

2015-10-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9305:

Target Version/s:   (was: 2.7.2)

> Delayed heartbeat processing causes storm of subsequent heartbeats
> --
>
> Key: HDFS-9305
> URL: https://issues.apache.org/jira/browse/HDFS-9305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Arpit Agarwal
> Fix For: 2.7.2
>
> Attachments: HDFS-9305.01.patch, HDFS-9305.02.patch
>
>
> A DataNode typically sends a heartbeat to the NameNode every 3 seconds.  We 
> expect heartbeat handling to complete relatively quickly.  However, if 
> something unexpected causes heartbeat processing to get blocked, such as a 
> long GC or heavy lock contention within the NameNode, then heartbeat 
> processing would be delayed.  After recovering from this delay, the DataNode 
> then starts sending a storm of heartbeat messages in a tight loop.  In a 
> large cluster with many DataNodes, this storm of heartbeat messages could 
> cause harmful load on the NameNode and make overall cluster recovery more 
> difficult.
> The bug appears to be caused by incorrect timekeeping inside 
> {{BPServiceActor}}.  The next heartbeat time is always calculated as a delta 
> from the previous heartbeat time, without any compensation for possible long 
> latency on an individual heartbeat RPC.  The only mitigation would be 
> restarting all DataNodes to force a reset of the heartbeat schedule, or 
> simply wait out the storm until the scheduling catches up and corrects itself.
> This problem would not manifest after a NameNode restart.  In that case, the 
> NameNode would respond to the first heartbeat by telling the DataNode to 
> re-register, and {{BPServiceActor#reRegister}} would reset the heartbeat 
> schedule to the current time.  I believe the problem would only manifest if 
> the NameNode process kept alive, but processed heartbeats unexpectedly slowly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9305) Delayed heartbeat processing causes storm of subsequent heartbeats

2015-10-30 Thread Aaron T. Myers (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-9305:
-
Fix Version/s: 2.7.2

Setting the fix version to 2.7.2. [~arpitagarwal] - if that's not right, please 
change it appropriately.

> Delayed heartbeat processing causes storm of subsequent heartbeats
> --
>
> Key: HDFS-9305
> URL: https://issues.apache.org/jira/browse/HDFS-9305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Arpit Agarwal
> Fix For: 2.7.2
>
> Attachments: HDFS-9305.01.patch, HDFS-9305.02.patch
>
>
> A DataNode typically sends a heartbeat to the NameNode every 3 seconds.  We 
> expect heartbeat handling to complete relatively quickly.  However, if 
> something unexpected causes heartbeat processing to get blocked, such as a 
> long GC or heavy lock contention within the NameNode, then heartbeat 
> processing would be delayed.  After recovering from this delay, the DataNode 
> then starts sending a storm of heartbeat messages in a tight loop.  In a 
> large cluster with many DataNodes, this storm of heartbeat messages could 
> cause harmful load on the NameNode and make overall cluster recovery more 
> difficult.
> The bug appears to be caused by incorrect timekeeping inside 
> {{BPServiceActor}}.  The next heartbeat time is always calculated as a delta 
> from the previous heartbeat time, without any compensation for possible long 
> latency on an individual heartbeat RPC.  The only mitigation would be 
> restarting all DataNodes to force a reset of the heartbeat schedule, or 
> simply wait out the storm until the scheduling catches up and corrects itself.
> This problem would not manifest after a NameNode restart.  In that case, the 
> NameNode would respond to the first heartbeat by telling the DataNode to 
> re-register, and {{BPServiceActor#reRegister}} would reset the heartbeat 
> schedule to the current time.  I believe the problem would only manifest if 
> the NameNode process kept alive, but processed heartbeats unexpectedly slowly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-30 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982267#comment-14982267
 ] 

Steve Loughran commented on HDFS-9276:
--

in the test, you don't need to catch that exception and convert to a 
{{fail()}}. Remove the {{try}} block and JUnit will handle it direclty

> Failed to Update HDFS Delegation Token for long running application in HA mode
> --
>
> Key: HDFS-9276
> URL: https://issues.apache.org/jira/browse/HDFS-9276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, ha, security
>Affects Versions: 2.7.1
>Reporter: Liangliang Gu
>Assignee: Liangliang Gu
> Attachments: HDFS-9276.01.patch, HDFS-9276.02.patch, 
> HDFS-9276.03.patch, HDFS-9276.04.patch, HDFS-9276.05.patch, 
> HDFS-9276.06.patch, HDFS-9276.07.patch, HDFS-9276.08.patch, 
> HDFS-9276.09.patch, debug1.PNG, debug2.PNG
>
>
> The Scenario is as follows:
> 1. NameNode HA is enabled.
> 2. Kerberos is enabled.
> 3. HDFS Delegation Token (not Keytab or TGT) is used to communicate with 
> NameNode.
> 4. We want to update the HDFS Delegation Token for long running applicatons. 
> HDFS Client will generate private tokens for each NameNode. When we update 
> the HDFS Delegation Token, these private tokens will not be updated, which 
> will cause token expired.
> This bug can be reproduced by the following program:
> {code}
> import java.security.PrivilegedExceptionAction
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> import org.apache.hadoop.security.UserGroupInformation
> object HadoopKerberosTest {
>   def main(args: Array[String]): Unit = {
> val keytab = "/path/to/keytab/xxx.keytab"
> val principal = "x...@abc.com"
> val creds1 = new org.apache.hadoop.security.Credentials()
> val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
> ugi1.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> val fs = FileSystem.get(new Configuration())
> fs.addDelegationTokens("test", creds1)
> null
>   }
> })
> val ugi = UserGroupInformation.createRemoteUser("test")
> ugi.addCredentials(creds1)
> ugi.doAs(new PrivilegedExceptionAction[Void] {
>   // Get a copy of the credentials
>   override def run(): Void = {
> var i = 0
> while (true) {
>   val creds1 = new org.apache.hadoop.security.Credentials()
>   val ugi1 = 
> UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
>   ugi1.doAs(new PrivilegedExceptionAction[Void] {
> // Get a copy of the credentials
> override def run(): Void = {
>   val fs = FileSystem.get(new Configuration())
>   fs.addDelegationTokens("test", creds1)
>   null
> }
>   })
>   UserGroupInformation.getCurrentUser.addCredentials(creds1)
>   val fs = FileSystem.get( new Configuration())
>   i += 1
>   println()
>   println(i)
>   println(fs.listFiles(new Path("/user"), false))
>   Thread.sleep(60 * 1000)
> }
> null
>   }
> })
>   }
> }
> {code}
> To reproduce the bug, please set the following configuration to Name Node:
> {code}
> dfs.namenode.delegation.token.max-lifetime = 10min
> dfs.namenode.delegation.key.update-interval = 3min
> dfs.namenode.delegation.token.renew-interval = 3min
> {code}
> The bug will occure after 3 minutes.
> The stacktrace is:
> {code}
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 330156 for test) is expired
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982263#comment-14982263
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1340 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1340/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982274#comment-14982274
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #617 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/617/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt

2015-10-30 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982268#comment-14982268
 ] 

Steve Loughran commented on HDFS-9328:
--

Are you proposing a different cpp guide from the rest of the project? There's 
no per-module Java style guide after all

> Formalize coding standards for libhdfs++ and put them in a README.txt
> -
>
> Key: HDFS-9328
> URL: https://issues.apache.org/jira/browse/HDFS-9328
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Haohui Mai
>Priority: Blocker
>
> We have 2-3 people working on this project full time and hopefully more 
> people will start contributing.  In order to efficiently scale we need a 
> single, easy to find, place where developers can check to make sure they are 
> following the coding standards of this project to both save their time and 
> save the time of people doing code reviews.
> The most practical place to do this seems like a README file in libhdfspp/. 
> The foundation of the standards is google's C++ guide found here: 
> https://google-styleguide.googlecode.com/svn/trunk/cppguide.html
> Any exceptions to google's standards or additional restrictions need to be 
> explicitly enumerated so there is one single point of reference for all 
> libhdfs++ code standards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982295#comment-14982295
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #554 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/554/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-10-30 Thread Walter Su (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-9173:

Attachment: HDFS-9173.05.patch

rebased

> Erasure Coding: Lease recovery for striped file
> ---
>
> Key: HDFS-9173
> URL: https://issues.apache.org/jira/browse/HDFS-9173
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9173.00.wip.patch, HDFS-9173.01.patch, 
> HDFS-9173.02.step125.patch, HDFS-9173.03.patch, HDFS-9173.04.patch, 
> HDFS-9173.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983718#comment-14983718
 ] 

Walter Su commented on HDFS-9236:
-

I agree with [~zhz] that a buggy DN could cause this issue.

And I agree with you said. Thanks for digging into the code. My mistake. I 
revise what I said:
  -If a DN has a RUR, it will return RecoveryInProgressException.-
   If a replica is RUR, a DN returns {{RecoveryInProgressException}} or its 
original state.

The thing is, it never returns RUR. It's weird {{syncBlock()}} assume there is 
a RUR in syncList.
It's just I prefer to keep the role of {{syncBlock}} simple as the javadoc said 
"Block synchronization.".
{{recover()}} does only one thing, checking, so it's a better place. If you 
insist, let's check the arguments at the top of {{syncBlock()}}.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9352) Hadoop-Trunk test cases start failing from morning

2015-10-30 Thread Brahma Reddy Battula (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9352:
---
Priority: Blocker  (was: Major)

> Hadoop-Trunk test cases start failing from morning
> --
>
> Key: HDFS-9352
> URL: https://issues.apache.org/jira/browse/HDFS-9352
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
>
> {noformat}
> Stack Trace:
> org.apache.hadoop.ipc.RemoteException: File /TestAbandonBlock_test_quota1 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 2 datanode(s) running and 1 node(s) are excluded in this operation.
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1736)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:299)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2457)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:796)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2305)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2301)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2301)
> {noformat}
>  https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/558/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9340) libhdfspp fails to compile after HDFS-9207

2015-10-30 Thread Haohui Mai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9340:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-8707
   Status: Resolved  (was: Patch Available)

I've committed the patch to the HDFS-8707 branch. Thanks for the reviews.

> libhdfspp fails to compile after HDFS-9207
> --
>
> Key: HDFS-9340
> URL: https://issues.apache.org/jira/browse/HDFS-9340
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: HDFS-8707
>
> Attachments: HDFS-9340.HDFS-8707.000.patch
>
>
> After the refactor of HDFS-9207 the {{hadoop-hdfs-client}} module fails to 
> compile as it invokes {{cmake}} against a directory that does not exist. It 
> should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9351:

Description: 
I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
one thing.

The patch changed
{code}
  if (snapshottableDirs != null) {
SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
.getSnapshottableDirListing();
if (snapshotDirs != null) {
  for (SnapshottableDirectoryStatus dir : snapshotDirs) {
snapshottableDirs.add(dir.getFullPath().toString());
  }
}
  }
{code}
to
{code}
  if (snapshottableDirs != null) {
snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
  }
{code}

In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
checkNNStartup(), however, this is not done with the new code. This is a hole 
of the earlier HDFS-9231 patch.

Create this jira to fix the hole. Suggest to revert this portion of the change, 
and add code to do what we need at NamenodeFsck.

Sorry for missing this in my earlier review of HDFS-9231.

Thanks.



  was:
I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
one thing.

The patch changed
{code}
  if (snapshottableDirs != null) {
SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
.getSnapshottableDirListing();
if (snapshotDirs != null) {
  for (SnapshottableDirectoryStatus dir : snapshotDirs) {
snapshottableDirs.add(dir.getFullPath().toString());
  }
}
  }
{code}
to
{code}
  if (snapshottableDirs != null) {
snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
  }
{code}

In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
checkNNStartup(), however, this is not done with the new code. This is a hole 
of the earlier HDFS-9231 patch.

Create this jira to fix the hole. Suggest to revert this portion of the change, 
and add code to do what we need at NamenodeFsck.

Thanks.




> No checkNNStartup() is called when fsck calls 
> FSNamesystem.getSnapshottableDirs()
> -
>
> Key: HDFS-9351
> URL: https://issues.apache.org/jira/browse/HDFS-9351
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Yongjun Zhang
>Assignee: Xiao Chen
>
> I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
> one thing.
> The patch changed
> {code}
>   if (snapshottableDirs != null) {
> SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
> .getSnapshottableDirListing();
> if (snapshotDirs != null) {
>   for (SnapshottableDirectoryStatus dir : snapshotDirs) {
> snapshottableDirs.add(dir.getFullPath().toString());
>   }
> }
>   }
> {code}
> to
> {code}
>   if (snapshottableDirs != null) {
> snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
>   }
> {code}
> In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
> checkNNStartup(), however, this is not done with the new code. This is a hole 
> of the earlier HDFS-9231 patch.
> Create this jira to fix the hole. Suggest to revert this portion of the 
> change, and add code to do what we need at NamenodeFsck.
> Sorry for missing this in my earlier review of HDFS-9231.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983695#comment-14983695
 ] 

Hadoop QA commented on HDFS-9129:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 4s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} Patch generated 5 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 809, now 757). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 33s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 41s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 178m 37s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   |

[jira] [Commented] (HDFS-9341) Simulated slow disk in SimulatedFSDataset

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983835#comment-14983835
 ] 

Hadoop QA commented on HDFS-9341:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s 
{color} | {color:red} Patch generated 2 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 386, now 388). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 5s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 42s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 181m 47s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   |

[jira] [Assigned] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang reassigned HDFS-9351:
---

Assignee: Xiao Chen

Hi Xiao, would you please look into? thanks much.


> No checkNNStartup() is called when fsck calls 
> FSNamesystem.getSnapshottableDirs()
> -
>
> Key: HDFS-9351
> URL: https://issues.apache.org/jira/browse/HDFS-9351
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Yongjun Zhang
>Assignee: Xiao Chen
>
> I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
> one thing.
> The patch changed
> {code}
>   if (snapshottableDirs != null) {
> SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
> .getSnapshottableDirListing();
> if (snapshotDirs != null) {
>   for (SnapshottableDirectoryStatus dir : snapshotDirs) {
> snapshottableDirs.add(dir.getFullPath().toString());
>   }
> }
>   }
> {code}
> to
> {code}
>   if (snapshottableDirs != null) {
> snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
>   }
> {code}
> In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
> checkNNStartup(), however, this is not done with the new code. This is a hole 
> of the earlier HDFS-9231 patch.
> Create this jira to fix the hole. Suggest to revert this portion of the 
> change, and add code to do what we need at NamenodeFsck.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9348) DFS GetErasureCodingPolicy API on a non-existent file should be handled properly

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983713#comment-14983713
 ] 

Hadoop QA commented on HDFS-9348:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
1s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 23s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-hdfs-project (total was 81, now 81). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 59s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 7s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 2s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 162m 49s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   |

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983731#comment-14983731
 ] 

Walter Su commented on HDFS-9236:
-

{{syncBlock}} already has an assumption that there's no RUR in syncList. 
{{bestState}} starts with RWR. {{bestState}} can't be {{RUR}}.
{code}
  // Calculate the best available replica state.
  ReplicaState bestState = ReplicaState.RWR;
  for (BlockRecord r : syncList) {
if (rState.getValue() < bestState.getValue()) {
  bestState = rState;
}
{code}

It's weird we add one more assumption that there is a RUR in syncList.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-9352) Hadoop-Trunk test cases start failing from morning

2015-10-30 Thread Brahma Reddy Battula (JIRA)

Brahma Reddy Battula created HDFS-9352:
--

 Summary: Hadoop-Trunk test cases start failing from morning
 Key: HDFS-9352
 URL: https://issues.apache.org/jira/browse/HDFS-9352
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula


{noformat}
Stack Trace:
org.apache.hadoop.ipc.RemoteException: File /TestAbandonBlock_test_quota1 could 
only be replicated to 0 nodes instead of minReplication (=1).  There are 2 
datanode(s) running and 1 node(s) are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1736)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:299)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2457)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:796)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2305)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2301)
{noformat}

 https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/558/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9254) HDFS Secure Mode Documentation updates

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983793#comment-14983793
 ] 

Hadoop QA commented on HDFS-9254:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 34 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 4s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 21s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 49s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 4s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s 
{color} | {color:red} Patch generated 58 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 181m 30s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.security.TestShellBasedIdMapping |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   |

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983828#comment-14983828
 ] 

Yongjun Zhang commented on HDFS-9236:
-

Hi [~walter.k.su],

Thanks for the comments. 

Agree that {{bestState}} won't be {{RUR}}, but seems to me that it doesn't mean 
there may not be {{RUR}} in the syncList. Especially RUR may exist as a bug 
situation as discussed, and the fix is to try to issue a good message when 
there is bug.  So we actually can not assume there is no RUR in the syncList. 
Right? Thanks.





> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983829#comment-14983829
 ] 

Hadoop QA commented on HDFS-9351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 5s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 4s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 359, now 359). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 43s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 27s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 17s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 149m 45s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   |

[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983844#comment-14983844
 ] 

Walter Su commented on HDFS-9236:
-

I mean RUR shouldn't be put in syncList.

> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, 
> HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch
>
>
> Ran into an issue while running test against faulty data-node code. 
> Currently in DataNode.java:
> {code:java}
>   /** Block synchronization */
>   void syncBlock(RecoveringBlock rBlock,
>  List syncList) throws IOException {
> …
> // Calculate the best available replica state.
> ReplicaState bestState = ReplicaState.RWR;
> …
> // Calculate list of nodes that will participate in the recovery
> // and the new block size
> List participatingList = new ArrayList();
> final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
> -1, recoveryId);
> switch(bestState) {
> …
> case RBW:
> case RWR:
>   long minLength = Long.MAX_VALUE;
>   for(BlockRecord r : syncList) {
> ReplicaState rState = r.rInfo.getOriginalReplicaState();
> if(rState == bestState) {
>   minLength = Math.min(minLength, r.rInfo.getNumBytes());
>   participatingList.add(r);
> }
>   }
>   newBlock.setNumBytes(minLength);
>   break;
> …
> }
> …
> nn.commitBlockSynchronization(block,
> newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
> datanodes, storages);
>   }
> {code}
> This code is called by the DN coordinating the block recovery. In the above 
> case, it is possible for none of the rState (reported by DNs with copies of 
> the replica being recovered) to match the bestState. This can either be 
> caused by faulty DN code or stale/modified/corrupted files on DN. When this 
> happens the DN will end up reporting the minLengh of Long.MAX_VALUE.
> Unfortunately there is no check on the NN for replica length. See 
> FSNamesystem.java:
> {code:java}
>   void commitBlockSynchronization(ExtendedBlock oldBlock,
>   long newgenerationstamp, long newlength,
>   boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>   String[] newtargetstorages) throws IOException {
> …
>   if (deleteblock) {
> Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
> boolean remove = iFile.removeLastBlock(blockToDel) != null;
> if (remove) {
>   blockManager.removeBlock(storedBlock);
> }
>   } else {
> // update last block
> if(!copyTruncate) {
>   storedBlock.setGenerationStamp(newgenerationstamp);
>   
>   // XXX block length is updated without any check <<<   storedBlock.setNumBytes(newlength);
> }
> …
> if (closeFile) {
>   LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>   + ", file=" + src
>   + (copyTruncate ? ", newBlock=" + truncatedBlock
>   : ", newgenerationstamp=" + newgenerationstamp)
>   + ", newlength=" + newlength
>   + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
> } else {
>   LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
> }
>   }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent 
> block report (even with correct length) will cause the block to be marked as 
> corrupted. Since this is block could be the last block of the file. If this 
> happens and the client goes away, NN won’t be able to recover the lease and 
> close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both DN and NN to 
> prevent such case from happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9351:

Attachment: HDFS-9351.patch

Thanks very much for the additional review and creating this JIRA. Good catch!
I extracted that logic to {{FSNamesystem#getSnapshottableDirs}} method 
initially, but after some revs (patch004 in HDF-9231) that method is only 
called once and I should have reverted it anyways. My apologies for not doing 
so on the initial JIRA. Attached a patch to do this.

> No checkNNStartup() is called when fsck calls 
> FSNamesystem.getSnapshottableDirs()
> -
>
> Key: HDFS-9351
> URL: https://issues.apache.org/jira/browse/HDFS-9351
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Yongjun Zhang
>Assignee: Xiao Chen
> Attachments: HDFS-9351.patch
>
>
> I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
> one thing.
> The patch changed
> {code}
>   if (snapshottableDirs != null) {
> SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
> .getSnapshottableDirListing();
> if (snapshotDirs != null) {
>   for (SnapshottableDirectoryStatus dir : snapshotDirs) {
> snapshottableDirs.add(dir.getFullPath().toString());
>   }
> }
>   }
> {code}
> to
> {code}
>   if (snapshottableDirs != null) {
> snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
>   }
> {code}
> In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
> checkNNStartup(), however, this is not done with the new code. This is a hole 
> of the earlier HDFS-9231 patch.
> Create this jira to fix the hole. Suggest to revert this portion of the 
> change, and add code to do what we need at NamenodeFsck.
> Sorry for missing this in my earlier review of HDFS-9231.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9329) TestBootstrapStandby#testRateThrottling is flaky because fsimage size is smaller than IO buffer size

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983721#comment-14983721
 ] 

Hadoop QA commented on HDFS-9329:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 19s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 43s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 28s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 179m 47s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | 
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
|   |

[jira] [Created] (HDFS-9353) Code and comment mismatch in JavaKeyStoreProvider

2015-10-30 Thread nijel (JIRA)

nijel created HDFS-9353:
---

 Summary: Code and comment mismatch in  JavaKeyStoreProvider 
 Key: HDFS-9353
 URL: https://issues.apache.org/jira/browse/HDFS-9353
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: nijel
Priority: Trivial


In

org.apache.hadoop.crypto.key.JavaKeyStoreProvider.JavaKeyStoreProvider(URI uri, 
Configuration conf) throws IOException

The comment mentioned is
{code}
// Get the password file from the conf, if not present from the user's
// environment var
{code}

But the code takes the value form ENV first

I think this make sense since the user can pass the ENV for a particular run.

My suggestion is to change the comment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983824#comment-14983824
 ] 

Vinayakumar B commented on HDFS-4937:
-

hi [~aw], any idea why tests did not run in last precommit for the patch here  
[above comment 
|https://issues.apache.org/jira/browse/HDFS-4937?focusedCommentId=14981649=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14981649]
?

> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Yongjun Zhang (JIRA)

Yongjun Zhang created HDFS-9351:
---

 Summary: No checkNNStartup() is called when fsck calls 
FSNamesystem.getSnapshottableDirs()
 Key: HDFS-9351
 URL: https://issues.apache.org/jira/browse/HDFS-9351
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Reporter: Yongjun Zhang


I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
one thing.

The patch changed
{code}
  if (snapshottableDirs != null) {
SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
.getSnapshottableDirListing();
if (snapshotDirs != null) {
  for (SnapshottableDirectoryStatus dir : snapshotDirs) {
snapshottableDirs.add(dir.getFullPath().toString());
  }
}
  }
{code}
to
{code}
  if (snapshottableDirs != null) {
snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
  }
{code}

In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
checkNNStartup(), however, this is not done with the new code. This is a hole 
of the earlier HDFS-9231 patch.

Create this jira to fix the hole. Suggest to revert this portion of the change, 
and add code to do what we need at NamenodeFsck.

Thanks.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9320) libhdfspp should not use sizeof for stream parsing

2015-10-30 Thread Haohui Mai (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983669#comment-14983669
 ] 

Haohui Mai commented on HDFS-9320:
--

{code}
+const unsigned int kSizeofJavaShort = 2;
+const unsigned int kSizeofJavaInt = 4;
+
{code}

should be inlined as {{sizeof(uint16_t)}} and {{sizeof(uint32_t)}} 
respectively. Other than that it looks good to me.


> libhdfspp should not use sizeof for stream parsing
> --
>
> Key: HDFS-9320
> URL: https://issues.apache.org/jira/browse/HDFS-9320
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-9320.HDFS-8707.000.patch
>
>
> In a few places, we're using sizeof(int) and sizeof(short) to determine where 
> in the received buffers we should be looking for data.  Those values are 
> compiler- and platform-dependent.  We should use specified sizes, or at least 
> sizeof(int32_t).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9351) No checkNNStartup() is called when fsck calls FSNamesystem.getSnapshottableDirs()

2015-10-30 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9351:

Status: Patch Available  (was: Open)

> No checkNNStartup() is called when fsck calls 
> FSNamesystem.getSnapshottableDirs()
> -
>
> Key: HDFS-9351
> URL: https://issues.apache.org/jira/browse/HDFS-9351
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Reporter: Yongjun Zhang
>Assignee: Xiao Chen
> Attachments: HDFS-9351.patch
>
>
> I looked further at HDFS-9231 change I reviewed earlier, and realize I missed 
> one thing.
> The patch changed
> {code}
>   if (snapshottableDirs != null) {
> SnapshottableDirectoryStatus[] snapshotDirs = namenode.getRpcServer()
> .getSnapshottableDirListing();
> if (snapshotDirs != null) {
>   for (SnapshottableDirectoryStatus dir : snapshotDirs) {
> snapshottableDirs.add(dir.getFullPath().toString());
>   }
> }
>   }
> {code}
> to
> {code}
>   if (snapshottableDirs != null) {
> snapshottableDirs = namenode.getNamesystem().getSnapshottableDirs();
>   }
> {code}
> In old code, namenode.getRpcServer().getSnapshottableDirListing() calls 
> checkNNStartup(), however, this is not done with the new code. This is a hole 
> of the earlier HDFS-9231 patch.
> Create this jira to fix the hole. Suggest to revert this portion of the 
> change, and add code to do what we need at NamenodeFsck.
> Sorry for missing this in my earlier review of HDFS-9231.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9308) Add truncateMeta() and deleteMeta() to MiniDFSCluster

2015-10-30 Thread Tony Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Wu updated HDFS-9308:
--
Attachment: HDFS-9308.003.patch

In v3 patch:
* Addressed [~eddyxu]'s review comments.

*Please note:*
{{TestCrcCorruption#testCrcCorruption}} seems to be broken by 
43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f for HDFS-4937 (did a quick binary 
search to find this change). I will open a separate JIRA for the test failure.

Manually applied the patch to before the change and verified it works:
{code}
---
 T E S T S
---
Running org.apache.hadoop.hdfs.TestCrcCorruption
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.56 sec - in 
org.apache.hadoop.hdfs.TestCrcCorruption
Running org.apache.hadoop.hdfs.TestLeaseRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.752 sec - in 
org.apache.hadoop.hdfs.TestLeaseRecovery

Results :

Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
{code}

> Add truncateMeta() and deleteMeta() to MiniDFSCluster
> -
>
> Key: HDFS-9308
> URL: https://issues.apache.org/jira/browse/HDFS-9308
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Minor
> Attachments: HDFS-9308.001.patch, HDFS-9308.002.patch, 
> HDFS-9308.003.patch
>
>
> HDFS-9188 introduced {{corruptMeta()}} method to make corrupting the metadata 
> file filesystem agnostic. There should also be a {{truncateMeta()}} and 
> {{deleteMeta()}} method in MiniDFSCluster to allow truncation of metadata 
> files on DataNodes without writing code that's specific to underling file 
> system. {{FsDatasetTestUtils#truncateMeta()}} is already implemented by 
> HDFS-9188 and cam be exposed easily in {{MiniDFSCluster}}.
> This will be useful for tests such as 
> {{TestLeaseRecovery#testBlockRecoveryWithLessMetafile}} and 
> {{TestCrcCorruption#testCrcCorruption}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9276) Failed to Update HDFS Delegation Token for long running application in HA mode

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982499#comment-14982499
 ] 

Hadoop QA commented on HDFS-9276:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run 
convertXmlToText from findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
1s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 10s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 7s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 22s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 26s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 2s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 56 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 157m 0s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.fs.contract.hdfs.TestHDFSContractConcat |
|   | hadoop.hdfs.TestRecoverStripedFile |
| JDK v1.7.0_79 Failed junit tests | hadoop.metrics2.sink.TestFileSink |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-10-30 |
| JIRA

[jira] [Commented] (HDFS-9323) Randomize the DFSStripedOutputStreamWithFailure tests

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982542#comment-14982542
 ] 

Hudson commented on HDFS-9323:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2492 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2492/])
HDFS-9323. Randomize the DFSStripedOutputStreamWithFailure tests. (szetszwo: 
rev f072eb5a206d34d8af39d65c3ef1f39faaebfdd0)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure060.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure100.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure140.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure200.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure030.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure070.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure210.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure080.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure150.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure190.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure160.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure020.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure090.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure130.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure040.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure120.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure110.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure170.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure050.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamWithFailure180.java


> Randomize the DFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-9323
> URL: https://issues.apache.org/jira/browse/HDFS-9323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h9323_20151028.patch
>
>
> Currently, the DFSStripedOutputStreamWithFailure tests are run with fixed 
> indices #0 - #19 and #59, and also a test with random index 
> (testDatanodeFailureRandomLength).  We should randomize all the tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9275) Wait previous ErasureCodingWork to finish before schedule another one

2015-10-30 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982604#comment-14982604
 ] 

Walter Su commented on HDFS-9275:
-

The failed tests are not related.

> Wait previous ErasureCodingWork to finish before schedule another one
> -
>
> Key: HDFS-9275
> URL: https://issues.apache.org/jira/browse/HDFS-9275
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9275.01.patch, HDFS-9275.02.patch, 
> HDFS-9275.03.patch, HDFS-9275.04.patch, HDFS-9275.05.patch
>
>
> In {{ErasureCodingWorker}}, for the same block group, one task doesn't know 
> which internal blocks is in recovering by other tasks. We could end up with 
> recovering 2 identical block with same index. So, {{ReplicationMonitor}} 
> should wait previous work to finish before schedule another one.
> This is related to the occasional failure of {{TestRecoverStripedFile}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982643#comment-14982643
 ] 

Kihwal Lee commented on HDFS-4937:
--

Also committed to branch-2.7.

> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9320) libhdfspp should not use sizeof for stream parsing

2015-10-30 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9320:
--
Attachment: HDFS-9320.HDFS-8707.000.patch

Added constants defining the size of java ints and shorts in datatransfer.h and 
use them in the corresponding parts of the wire protocol that implicitly rely 
on java's sizes.

Changed int to int32_t in places where there was bitwise operations just to be 
safe.

> libhdfspp should not use sizeof for stream parsing
> --
>
> Key: HDFS-9320
> URL: https://issues.apache.org/jira/browse/HDFS-9320
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-9320.HDFS-8707.000.patch
>
>
> In a few places, we're using sizeof(int) and sizeof(short) to determine where 
> in the received buffers we should be looking for data.  Those values are 
> compiler- and platform-dependent.  We should use specified sizes, or at least 
> sizeof(int32_t).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-9320) libhdfspp should not use sizeof for stream parsing

2015-10-30 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer reassigned HDFS-9320:
-

Assignee: James Clampffer

> libhdfspp should not use sizeof for stream parsing
> --
>
> Key: HDFS-9320
> URL: https://issues.apache.org/jira/browse/HDFS-9320
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
>
> In a few places, we're using sizeof(int) and sizeof(short) to determine where 
> in the received buffers we should be looking for data.  Those values are 
> compiler- and platform-dependent.  We should use specified sizes, or at least 
> sizeof(int32_t).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock

2015-10-30 Thread Chang Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982617#comment-14982617
 ] 

Chang Li commented on HDFS-9289:


Thanks [~zhz], the jira title looks good to me. Looking forward to your further 
review.

> Make DataStreamer#block thread safe and verify genStamp in commitBlock
> --
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch, 
> HDFS-9289.4.patch, HDFS-9289.5.patch, HDFS-9289.6.patch
>
>
> we have seen a case of corrupt block which is caused by file complete after a 
> pipelineUpdate, but the file complete with the old block genStamp. This 
> caused the replicas of two datanodes in updated pipeline to be viewed as 
> corrupte. Propose to check genstamp when commit block



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-4937:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, [~hitliuyi]. I've committed this to trunk and branch-2.

> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9320) libhdfspp should not use sizeof for stream parsing

2015-10-30 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9320:
--
Status: Patch Available  (was: Open)

> libhdfspp should not use sizeof for stream parsing
> --
>
> Key: HDFS-9320
> URL: https://issues.apache.org/jira/browse/HDFS-9320
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-9320.HDFS-8707.000.patch
>
>
> In a few places, we're using sizeof(int) and sizeof(short) to determine where 
> in the received buffers we should be looking for data.  Those values are 
> compiler- and platform-dependent.  We should use specified sizes, or at least 
> sizeof(int32_t).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (HDFS-9272) Implement a unix-like cat utility

2015-10-30 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer resolved HDFS-9272.
---
Resolution: Won't Fix

I went down the wrong path with this so I'll close it.  Will either open 
another jira to add optional host and port parameters to existing libhdfs tests 
or implement hdfsConnect("default",0) in the C API.

> Implement a unix-like cat utility
> -
>
> Key: HDFS-9272
> URL: https://issues.apache.org/jira/browse/HDFS-9272
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Minor
> Attachments: HDFS-9272.HDFS-8707.000.patch
>
>
> Implement the basic functionality of "cat" and have it build as a separate 
> executable.
> 2 Reasons for this:
> We don't have any real integration tests at the moment so something simple to 
> verify that the library actually works against a real cluster is useful.
> Eventually I'll make more utilities like stat, mkdir etc.  Once there are 
> enough of them it will be simple to make a C++ implementation of the hadoop 
> fs command line interface that doesn't take the latency hit of spinning up a 
> JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982633#comment-14982633
 ] 

Hudson commented on HDFS-4937:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8730 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8730/])
HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 
43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-4937:
-
Fix Version/s: (was: 2.8.0)
   2.7.2

> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982676#comment-14982676
 ] 

Hudson commented on HDFS-4937:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2549 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2549/])
HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 
43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-1477) Make NameNode Reconfigurable.

2015-10-30 Thread Xiaobing Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983266#comment-14983266
 ] 

Xiaobing Zhou commented on HDFS-1477:
-

Thanks [~arpitagarwal] for review! The next patch will address your comments.

> Make NameNode Reconfigurable.
> -
>
> Key: HDFS-1477
> URL: https://issues.apache.org/jira/browse/HDFS-1477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Patrick Kling
>Assignee: Xiaobing Zhou
> Attachments: HDFS-1477.005.patch, HDFS-1477.2.patch, 
> HDFS-1477.3.patch, HDFS-1477.4.patch, HDFS-1477.patch
>
>
> Modify NameNode to implement the interface Reconfigurable proposed in 
> HADOOP-7001. This would allow us to change certain configuration properties 
> without restarting the name node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9348) DFS GetErasureCodingPolicy API on a non-existent file should be handled properly

2015-10-30 Thread Uma Maheswara Rao G (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983301#comment-14983301
 ] 

Uma Maheswara Rao G commented on HDFS-9348:
---

Hi Rakesh, unifying the behaviors would be a good idea. Yeah, I think ECZone 
related APIs behavior change would be a incompatible change. In my opinion, 
doing something on nonexistent file would be users unexpected situations, so 
letting user know about the file existence make sense to me. Do we also need to 
check ECPolicy setting and ECZone getting APIs?

[~andrew.wang], [~hitliuyi], could you guys also comment what is your opinion 
on this change?



> DFS GetErasureCodingPolicy API on a non-existent file should be handled 
> properly
> 
>
> Key: HDFS-9348
> URL: https://issues.apache.org/jira/browse/HDFS-9348
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Minor
> Attachments: HDFS-9348-00.patch
>
>
> Presently calling {{dfs#getErasureCodingPolicy()}} on a non-existent file is 
> returning the ErasureCodingPolicy info. As per the 
> [discussion|https://issues.apache.org/jira/browse/HDFS-8777?focusedCommentId=14981077=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14981077]
>  it has to validate and throw FileNotFoundException.
> Also, {{dfs#getEncryptionZoneForPath()}} API has the same behavior. Again we 
> can discuss to add the file existence validation in this case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HDFS-2746) Add a parameter to `hdfs dfsadmin -saveNamespace' to allow an arbitrary storage dir

2015-10-30 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen reassigned HDFS-2746:
---

Assignee: Xiao Chen

> Add a parameter to `hdfs dfsadmin -saveNamespace' to allow an arbitrary 
> storage dir
> ---
>
> Key: HDFS-2746
> URL: https://issues.apache.org/jira/browse/HDFS-2746
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Aaron T. Myers
>Assignee: Xiao Chen
>
> Presently `hdfs dfsadmin -saveNamespace' will attempt to save a snapshot of 
> the file system metadata state into all configured storage dirs. It would be 
> nice if we could optionally specify an arbitrary local dir on the NN to save 
> the FS state into, instead of the configured storage dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-2746) Add a parameter to `hdfs dfsadmin -saveNamespace' to allow an arbitrary storage dir

2015-10-30 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983502#comment-14983502
 ] 

Xiao Chen commented on HDFS-2746:
-

Thanks ATM and Harsh for creating this issue and clarifying. This sounds to be 
an interesting feature, and I would love to work on it.

> Add a parameter to `hdfs dfsadmin -saveNamespace' to allow an arbitrary 
> storage dir
> ---
>
> Key: HDFS-2746
> URL: https://issues.apache.org/jira/browse/HDFS-2746
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Aaron T. Myers
>
> Presently `hdfs dfsadmin -saveNamespace' will attempt to save a snapshot of 
> the file system metadata state into all configured storage dirs. It would be 
> nice if we could optionally specify an arbitrary local dir on the NN to save 
> the FS state into, instead of the configured storage dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Work started] (HDFS-9140) Discover conf parameters that need no DataNode restart to make changes effective

2015-10-30 Thread Xiaobing Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-9140 started by Xiaobing Zhou.
---
> Discover conf parameters that need no DataNode restart to make changes 
> effective
> 
>
> Key: HDFS-9140
> URL: https://issues.apache.org/jira/browse/HDFS-9140
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> This JIRA is to find those parameters that need NN/DN restart in order to 
> make the changes, otherwise, using admin facility API to reconfigure them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982818#comment-14982818
 ] 

Hudson commented on HDFS-4937:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #619 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/619/])
HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 
43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-9320) libhdfspp should not use sizeof for stream parsing

2015-10-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982719#comment-14982719
 ] 

Hadoop QA commented on HDFS-9320:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
0s {color} | {color:green} HDFS-8707 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK 
v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 10s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 12s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 12s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 11s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 12s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s 
{color} | {color:red} Patch generated 425 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 4s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-10-30 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12769791/HDFS-9320.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-9320 |
| Optional Tests |  asflicense  cc  unit  javac  compile  |
| uname | Linux ef48dc4f6ab7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e77b1ce/precommit/personality/hadoop.sh
 |
| git revision | HDFS-8707 / 85232c6 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_60 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13300/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13300/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_79.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13300/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt
 |
| cc |

[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()

2015-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982791#comment-14982791
 ] 

Hudson commented on HDFS-4937:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2493 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2493/])
HDFS-4937. ReplicationMonitor can infinite-loop in (kihwal: rev 
43539b5ff4ac0874a8a454dc93a2a782b0e0ea8f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> ReplicationMonitor can infinite-loop in 
> BlockPlacementPolicyDefault#chooseRandom()
> --
>
> Key: HDFS-4937
> URL: https://issues.apache.org/jira/browse/HDFS-4937
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.4-alpha, 0.23.8
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch
>
>
> When a large number of nodes are removed by refreshing node lists, the 
> network topology is updated. If the refresh happens at the right moment, the 
> replication monitor thread may stuck in the while loop of {{chooseRandom()}}. 
> This is because the cached cluster size is used in the terminal condition 
> check of the loop. This usually happens when a block with a high replication 
> factor is being processed. Since replicas/rack is also calculated beforehand, 
> no node choice may satisfy the goodness criteria if refreshing removed racks. 
> All nodes will end up in the excluded list, but the size will still be less 
> than the cached cluster size, so it will loop infinitely. This was observed 
> in a production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

1 2 >

1 - 100 of 131 matches

Mail list logo