[jira] [Comment Edited] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711204#comment-15711204
 ] 

Weiwei Yang edited comment on HDFS-11156 at 12/1/16 7:50 AM:
-

Regarding the Jenkins job result:

# The findbugs error is from a recent commit, HDFS-10930, not from this patch.
# The checkstyle issues were not caused by this patch either: the first flags a 
method that was already too long, and the second flags the existing use of 
nested blocks in a switch-case block.
# The 2 unit test failures are not related either; I ran them locally and they 
all passed. They seem to be intermittent failures; HDFS-9599 and HDFS-11030 
might be related.

I've also tested this with the latest trunk code on a small cluster. Both
{noformat}
http://<NN_HOST>:<PORT>/webhdfs/v1/tmp/test.log?op=GETFILEBLOCKLOCATIONS&length=100&offset=0
{noformat}

and 
{noformat}
http://<NN_HOST>:<PORT>/webhdfs/v1/tmp/test.log?op=GET_BLOCK_LOCATIONS&length=100&offset=0
{noformat}

work and return the expected value. [~liuml07], would you please help review 
the patch?

Thanks a lot.


was (Author: cheersyang):
Regarding the Jenkins job result:

# The findbugs error is from a recent commit, HDFS-10930, not from this patch.
# The checkstyle issues were not caused by this patch either: the first flags a 
method that was already too long, and the second flags the existing use of 
nested blocks in a switch-case block.
# The 2 unit test failures are not related either; I ran them locally and they 
all passed. They seem to be intermittent failures; HDFS-9599 and HDFS-11030 
might be related.

Hello [~liuml07], would you please help review the patch? Thanks a lot.

> Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
> --
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch
>
>
> The following WebHDFS REST API
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as an incompatible change since this will change the output of 
> the GET_BLOCK_LOCATIONS API.
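
For reference, here is a minimal sketch of the client-side contract the 
response should match; the path and range are illustrative, and {{fs}} is 
assumed to be an initialized FileSystem:
{code}
import java.util.Arrays;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: callers of getFileBlockLocations expect a BlockLocation[],
// not a serialized o.a.h.h.p.LocatedBlocks tree.
public class BlockLocationSketch {
  public static void printLocations(FileSystem fs) throws Exception {
    BlockLocation[] locs =
        fs.getFileBlockLocations(new Path("/tmp/test.log"), 0, 100);
    for (BlockLocation loc : locs) {
      // Each entry carries the offset, length and hosts serving that block.
      System.out.println(loc.getOffset() + "," + loc.getLength() + ","
          + Arrays.toString(loc.getHosts()));
    }
  }
}
{code}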






[jira] [Commented] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711204#comment-15711204
 ] 

Weiwei Yang commented on HDFS-11156:


Regarding the Jenkins job result:

# The findbugs error is from a recent commit, HDFS-10930, not from this patch.
# The checkstyle issues were not caused by this patch either: the first flags a 
method that was already too long, and the second flags the existing use of 
nested blocks in a switch-case block.
# The 2 unit test failures are not related either; I ran them locally and they 
all passed. They seem to be intermittent failures; HDFS-9599 and HDFS-11030 
might be related.

Hello [~liuml07], would you please help review the patch? Thanks a lot.

> Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
> --
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch
>
>
> The following WebHDFS REST API
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as an incompatible change since this will change the output of 
> the GET_BLOCK_LOCATIONS API.






[jira] [Commented] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711154#comment-15711154
 ] 

Xiaobing Zhou commented on HDFS-10913:
--

Posted the v006 patch.
# Added a separate test class, TestDataNodeFaultInjector.
# Split the jumbo test into smaller tests.

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch, HDFS-10913.006.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as a threshold to detect slow 
> mirrors, but BlockReceiver only writes warning logs when it is exceeded. To 
> better test the behavior of slow mirrors, fault injectors need to be introduced.
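
As a rough illustration of the fault-injector pattern common in the DataNode 
(names below are assumptions for this sketch, not necessarily the patch's 
actual API):
{code}
// Sketch: production code calls the no-op singleton; a test swaps in a
// subclass that injects delays or failures at the hook points.
public class DataNodeFaultInjectorSketch {
  private static DataNodeFaultInjectorSketch instance =
      new DataNodeFaultInjectorSketch();

  public static DataNodeFaultInjectorSketch get() {
    return instance;
  }

  public static void set(DataNodeFaultInjectorSketch injector) {
    instance = injector;
  }

  // No-op hook placed where a slow mirror would be observed; a test
  // override can sleep here to simulate a slow downstream DataNode.
  public void delaySendingAckToUpstream(String upstreamAddr) {
  }
}
{code}
A test would call {{set()}} with an overriding instance before exercising the 
write pipeline, and restore the default afterwards.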






[jira] [Updated] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10913:
-
Attachment: HDFS-10913.006.patch

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch, HDFS-10913.006.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as a threshold to detect slow 
> mirrors, but BlockReceiver only writes warning logs when it is exceeded. To 
> better test the behavior of slow mirrors, fault injectors need to be introduced.






[jira] [Commented] (HDFS-8630) WebHDFS : Support get/set/unset StoragePolicy

2016-11-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711140#comment-15711140
 ] 

Andrew Wang commented on HDFS-8630:
---

Hi Surendra, thanks for updating the patch. I only really have nits:

* baseTestHttpFSWith: could we convert the isLocalFS early exits to instead use 
Assume.assumeFalse? It's cleaner and more informative (see the sketch after 
this review).
* baseTestHttpFSWith: can also use assertArrayEquals. Please also add messages 
to the asserts for easier debugging in case one fails (and as documentation).
* FSOperations: please move the JSON conversion code to static methods like the 
others at the top of the file.

WebHDFS.md

* The anchor links are broken; I think they are incorrectly capitalized.
* Although this is not related, could you add parameter documentation for 
"startAfter" as well? Or we can do it in a separate JIRA.
* We also need to define the StoragePolicies schema; see FileStatuses/FileStatus 
as an example.
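
A rough sketch of the suggested Assume-based early exit, with an assumed 
{{isLocalFS()}} stand-in for the real BaseTestHttpFSWith check:
{code}
import static org.junit.Assume.assumeFalse;

import org.junit.Test;

public class AssumeSketch {
  // Stand-in for the real BaseTestHttpFSWith helper.
  private boolean isLocalFS() {
    return false;
  }

  @Test
  public void testGetStoragePolicy() {
    // Instead of "if (isLocalFS()) return;", mark the test as skipped so
    // the report shows it was not run rather than silently passing.
    assumeFalse(isLocalFS());
    // storage-policy assertions would follow here
  }
}
{code}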

> WebHDFS : Support get/set/unset StoragePolicy 
> --
>
> Key: HDFS-8630
> URL: https://issues.apache.org/jira/browse/HDFS-8630
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: webhdfs
>Reporter: nijel
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-8630.001.patch, HDFS-8630.002.patch, 
> HDFS-8630.003.patch, HDFS-8630.004.patch, HDFS-8630.005.patch, 
> HDFS-8630.006.patch, HDFS-8630.007.patch, HDFS-8630.008.patch, 
> HDFS-8630.009.patch, HDFS-8630.patch
>
>
> Users can set and get the storage policy from the filesystem object. The same 
> operations should be allowed through the REST API.






[jira] [Created] (HDFS-11195) When appending files by webhdfs rest api fails, it returns 200

2016-11-30 Thread Yuanbo Liu (JIRA)
Yuanbo Liu created HDFS-11195:
-

 Summary: When appending files by webhdfs rest api fails, it 
returns 200
 Key: HDFS-11195
 URL: https://issues.apache.org/jira/browse/HDFS-11195
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yuanbo Liu
Assignee: Yuanbo Liu


Suppose that a Hadoop cluster contains only one datanode, and 
dfs.replication=3. Run:
{code}
curl -i -X POST -T <LOCAL_FILE> 
"http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND"
{code}
It returns 200 even though the append operation fails.
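
For context, WebHDFS append is normally a two-step exchange, and the failure 
should surface in the second step; the placeholders below are illustrative:
{code}
# Step 1: ask the NameNode; it answers 307 with a Location header pointing
# at a DataNode (no file data is sent in this step).
curl -i -X POST "http://<NN_HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND"

# Step 2: send the data to the redirect target; a failed append here
# should not be reported back as 200 OK.
curl -i -X POST -T <LOCAL_FILE> "<Location-header-URL>"
{code}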






[jira] [Commented] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1571#comment-1571
 ] 

Hadoop QA commented on HDFS-11156:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
42s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} hadoop-hdfs-project: The patch generated 2 new + 
237 unchanged - 0 fixed = 239 total (was 237) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
1s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11156 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841220/HDFS-11156.05.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0d7e5a7f263a 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1f7613b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17725/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 

[jira] [Updated] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-11-30 Thread Jingcheng Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HDFS-9668:
---
Attachment: HDFS-9668-26.patch

Uploaded a new patch, V26, to address [~arpitagarwal]'s comments, except the 
ones with open questions. Thanks.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, 
> HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, 
> HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, 
> HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-18.patch, 
> HDFS-9668-19.patch, HDFS-9668-19.patch, HDFS-9668-2.patch, 
> HDFS-9668-20.patch, HDFS-9668-21.patch, HDFS-9668-22.patch, 
> HDFS-9668-23.patch, HDFS-9668-23.patch, HDFS-9668-24.patch, 
> HDFS-9668-25.patch, HDFS-9668-26.patch, HDFS-9668-3.patch, HDFS-9668-4.patch, 
> HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, HDFS-9668-8.patch, 
> HDFS-9668-9.patch, execution_time.png
>
>
> During an HBase test on tiered HDFS storage (the WAL is stored on 
> SSD/RAMDISK, and all other files are stored on HDD), we observed many 
> long-BLOCKED threads on FsDatasetImpl in the DataNode. The following is part 
> of the jstack output:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution time of some operations in FsDatasetImpl during the 
> test. The result follows.
> !execution_time.png!
> The finalizeBlock, addBlock and createRbw operations on HDD under heavy load 
> take a really long time.
> This means a single slow finalizeBlock, addBlock or createRbw operation on 
> slow storage can block all other such operations in the same DataNode, 
> especially in HBase 

[jira] [Comment Edited] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711016#comment-15711016
 ] 

Brahma Reddy Battula edited comment on HDFS-11178 at 12/1/16 6:03 AM:
--

[~linyiqun], thanks for updating the patch. LGTM. The {{findbugs}} warning is 
related to HDFS-10930; I already commented there.

Will commit this weekend if there are no comments from [~liuml07] or anybody 
else.


was (Author: brahmareddy):
[~linyiqun], thanks for updating the patch. LGTM. Will commit this weekend if 
there are no comments from [~liuml07] or anybody else.

> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11178.001.patch, HDFS-11178.002.patch, 
> HDFS-11178.003.patch
>
>
> The test {{TestAddStripedBlockInFBR#testAddBlockInFullBlockReport}} fails 
> easily in trunk. The failure is easy to reproduce: it fails 2~3 times when 
> I run the test 4~5 times locally. It also failed in a recent Jenkins run 
> (https://builds.apache.org/job/PreCommit-HDFS-Build/17667/testReport/). 
> The stack trace:
> {code}
> java.lang.AssertionError: expected:<9> but was:<7>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR.testAddBlockInFullBlockReport(TestAddStripedBlockInFBR.java:108)
> {code}
> The fix is easy: use {{GenericTestUtils.waitFor}} to wait for the full 
> block reports to arrive. 
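
A rough sketch of that approach, assuming a hypothetical 
{{getReportedBlockCount()}} helper as the polling condition:
{code}
import com.google.common.base.Supplier;

import org.apache.hadoop.test.GenericTestUtils;

// Sketch: poll until the expected count arrives instead of asserting
// immediately after triggering the full block report.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    return getReportedBlockCount() == expectedBlockCount;
  }
}, 100, 10000); // re-check every 100 ms, time out after 10 s
{code}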






[jira] [Commented] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711016#comment-15711016
 ] 

Brahma Reddy Battula commented on HDFS-11178:
-

[~linyiqun], thanks for updating the patch. LGTM. Will commit this weekend if 
there are no comments from [~liuml07] or anybody else.

> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11178.001.patch, HDFS-11178.002.patch, 
> HDFS-11178.003.patch
>
>
> The test {{TestAddStripedBlockInFBR#testAddBlockInFullBlockReport}} fails 
> easily in trunk. The failure is easy to reproduce: it fails 2~3 times when 
> I run the test 4~5 times locally. It also failed in a recent Jenkins run 
> (https://builds.apache.org/job/PreCommit-HDFS-Build/17667/testReport/). 
> The stack trace:
> {code}
> java.lang.AssertionError: expected:<9> but was:<7>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR.testAddBlockInFullBlockReport(TestAddStripedBlockInFBR.java:108)
> {code}
> The fix is easy: use {{GenericTestUtils.waitFor}} to wait for the full 
> block reports to arrive. 






[jira] [Commented] (HDFS-10885) [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on

2016-11-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710987#comment-15710987
 ] 

Rakesh R commented on HDFS-10885:
-

bq. On SPS side, the check is done directly in the SPS daemon thread, and 
currently, the startup stage is not a time-consuming procedure.
I agree; now the window for simultaneous operations between {{Mover}} and the 
{{SPS daemon}} is very narrow. Another point to consider: we have plans to 
dynamically enable/disable SPS without restarting the NN, please refer to 
HDFS-11123. IMHO, this has to be addressed as part of that work.

[~umamaheswararao], how about we go ahead and commit this jira's patch, and 
revisit this part once the dynamic ON/OFF logic is completed?

> [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier 
> is on
> --
>
> Key: HDFS-10885
> URL: https://issues.apache.org/jira/browse/HDFS-10885
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: HDFS-10285
>
> Attachments: HDFS-10800-HDFS-10885-00.patch, 
> HDFS-10800-HDFS-10885-01.patch, HDFS-10800-HDFS-10885-02.patch, 
> HDFS-10885-HDFS-10285.03.patch, HDFS-10885-HDFS-10285.04.patch, 
> HDFS-10885-HDFS-10285.05.patch, HDFS-10885-HDFS-10285.06.patch, 
> HDFS-10885-HDFS-10285.07.patch, HDFS-10885-HDFS-10285.08.patch, 
> HDFS-10885-HDFS-10285.09.patch
>
>
> These two must not run at the same time, to avoid conflicts where they fight 
> with each other.






[jira] [Commented] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710977#comment-15710977
 ] 

Hadoop QA commented on HDFS-11178:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
45s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m 
37s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11178 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841214/HDFS-11178.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d348f7d12624 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1f7613b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17724/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17724/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17724/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> 

[jira] [Comment Edited] (HDFS-11072) Add ability to unset and change directory EC policy

2016-11-30 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710741#comment-15710741
 ] 

SammiChen edited comment on HDFS-11072 at 12/1/16 4:51 AM:
---

The Findbugs warning is not related to this patch. It is triggered by 
"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long)", which was updated by HADOOP-10930. I will notify the owner of 
HADOOP-10930 about this issue.

commit aeecfa24f4fb6af289920cbf8830c394e66bd78e
Author: Xiaoyu Yao 
Date:   Tue Nov 29 20:52:36 2016 -0800

HADOOP-10930. Refactor: Wrap Datanode IO related operations. Contributed by
Xiaoyu Yao.


was (Author: sammi):
The Findbugs warning is not related to this patch. It is triggered by 
"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long)", which was updated by HADOOP-10930. I will notify the owner of 
HADOOP-10930 about this issue.

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Updated] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API

2016-11-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11156:
---
Attachment: HDFS-11156.05.patch

> Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
> --
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch
>
>
> The following WebHDFS REST API
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as an incompatible change since this will change the output of 
> the GET_BLOCK_LOCATIONS API.






[jira] [Updated] (HDFS-11180) Intermittent deadlock in NameNode when failover happens.

2016-11-30 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-11180:
-
Attachment: HDFS-11180-branch-2.01.patch

Rebased for branch-2.
I'll commit the 04 patch to trunk tonight (JST) if there are no objections.

> Intermittent deadlock in NameNode when failover happens.
> 
>
> Key: HDFS-11180
> URL: https://issues.apache.org/jira/browse/HDFS-11180
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Abhishek Modi
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: high-availability
> Attachments: HDFS-11180-branch-2.01.patch, HDFS-11180.00.patch, 
> HDFS-11180.01.patch, HDFS-11180.02.patch, HDFS-11180.03.patch, 
> HDFS-11180.04.patch, jstack.log
>
>
> It happens when metrics are updated at the same time a failover is in 
> progress. Please find attached a jstack taken at that point in time.






[jira] [Commented] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710834#comment-15710834
 ] 

Hadoop QA commented on HDFS-11160:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 186 unchanged - 1 fixed = 189 total (was 187) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 40s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 90m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| Timed out junit tests | org.apache.hadoop.hdfs.TestReplication |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11160 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841206/HDFS-11160.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 825ce1667713 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1f7613b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17723/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17723/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17723/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17723/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Commented] (HDFS-11184) Ozone: SCM: Make SCM use container protocol

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710820#comment-15710820
 ] 

Hadoop QA commented on HDFS-11184:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
55s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
20s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
10s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 86 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
41s{color} | {color:green} hadoop-hdfs-project generated 0 new + 119 unchanged 
- 1 fixed = 119 total (was 120) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 10 
unchanged - 2 fixed = 10 total (was 12) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs generated 0 new + 9 
unchanged - 1 fixed = 9 total (was 10) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
3s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 51s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
24s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.web.TestOzoneVolumes |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.ozone.web.client.TestKeys |
|   | hadoop.hdfs.tools.TestDelegationTokenFetcher |
|   | hadoop.ozone.container.common.TestDatanodeStateMachine |
|   | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker |
|   | hadoop.ozone.web.TestOzoneRestWithMiniCluster |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | HDFS-11184 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy

2016-11-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710818#comment-15710818
 ] 

Rakesh R commented on HDFS-11072:
-

Yes, we can ignore the findbugs warning; it's related to [HDFS-10930, please 
see the discussion in that 
jira|https://issues.apache.org/jira/browse/HDFS-10930?focusedCommentId=15710712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15710712]

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Updated] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11178:
-
Attachment: HDFS-11178.003.patch

Thanks [~brahmareddy] for the review. The comment makes sense to me. New patch 
attached.

> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11178.001.patch, HDFS-11178.002.patch, 
> HDFS-11178.003.patch
>
>
> The test {{TestAddStripedBlockInFBR#testAddBlockInFullBlockReport}} fails 
> easily in trunk. The failure is easy to reproduce: it fails 2~3 times when 
> I run the test 4~5 times locally. It also failed in a recent Jenkins run 
> (https://builds.apache.org/job/PreCommit-HDFS-Build/17667/testReport/). 
> The stack trace:
> {code}
> java.lang.AssertionError: expected:<9> but was:<7>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR.testAddBlockInFullBlockReport(TestAddStripedBlockInFBR.java:108)
> {code}
> The fix is easy: use {{GenericTestUtils.waitFor}} to wait for the full 
> block reports to arrive. 






[jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy

2016-11-30 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710741#comment-15710741
 ] 

SammiChen commented on HDFS-11072:
--

The Findbugs warning is not related to this patch. It is triggered by 
"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long)", which was updated by HADOOP-10930. I will notify the owner of 
HADOOP-10930 about this issue.

> Add ability to unset and change directory EC policy
> ---
>
> Key: HDFS-11072
> URL: https://issues.apache.org/jira/browse/HDFS-11072
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, 
> HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, 
> it makes sense to make it more similar to storage policies and allow changing 
> and unsetting the policy.






[jira] [Commented] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-11-30 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710729#comment-15710729
 ] 

Wei-Chiu Chuang commented on HDFS-11160:


Hi Yongjun, thanks for pointing that out. I should have made myself clearer. In 
your example, say FinalizedReplica S0 is converted from an RBW at S(-1); I meant 
to let the FinalizedReplica (S0) preserve the in-memory LPCC from the RBW at S(-1).

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption rate 
> in a relatively small cluster. Initially, we thought HDFS-11056 was to blame. 
> However, after applying HDFS-11056, we are still seeing VolumeScanner 
> report corrupt replicas.
> It turns out that if a replica is being appended to while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error and will attach it later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.
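
A minimal sketch of that quick fix's shape; the lock object, meta file handle 
and offsets are assumptions for illustration:
{code}
import java.io.IOException;
import java.io.RandomAccessFile;

public class ChecksumReadSketch {
  private final Object datasetLock = new Object();

  // Holding the dataset lock while reading the on-disk checksum prevents a
  // concurrent append from replacing the checksum under the scanner, so the
  // scanner never pairs a new checksum with old data.
  public byte[] readLastChecksum(RandomAccessFile metaFile, long offset,
      int length) throws IOException {
    synchronized (datasetLock) {
      byte[] buf = new byte[length];
      metaFile.seek(offset);
      metaFile.readFully(buf);
      return buf;
    }
  }
}
{code}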






[jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang

2016-11-30 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710722#comment-15710722
 ] 

Kihwal Lee commented on HDFS-11192:
---

Was it hitting its ulimit on the number of threads/processes? If that is the 
case, other bad things can happen even if this stage doesn't cause any trouble. 
E.g., replication queue init is done by starting up a separate thread; if that 
doesn't happen, the user will be in bigger trouble. A potential improvement 
would be to terminate the NN if this happens.
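
A rough sketch of that improvement, assuming the quota-init pool is created 
with an uncaught-exception handler (illustrative wiring only):
{code}
import java.util.concurrent.ForkJoinPool;

import org.apache.hadoop.util.ExitUtil;

public class QuotaInitPoolSketch {
  // Sketch: fail fast instead of hanging when a quota-init worker thread
  // dies from OOM and can never notify the waiting parent task.
  public static ForkJoinPool createPool(int parallelism) {
    return new ForkJoinPool(parallelism,
        ForkJoinPool.defaultForkJoinWorkerThreadFactory,
        (thread, throwable) -> {
          if (throwable instanceof OutOfMemoryError) {
            ExitUtil.terminate(1, "OOM during quota initialization");
          }
        },
        false); // asyncMode off, the default work-stealing behavior
  }
}
{code}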

> OOM during Quota Initialization lead to Namenode hang
> -
>
> Key: HDFS-11192
> URL: https://issues.apache.org/jira/browse/HDFS-11192
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot 
> be created, it does not notify the parent. The parent keeps waiting for the 
> notify call, and that wait is not even a timed wait.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
> "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
> native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}






[jira] [Comment Edited] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710712#comment-15710712
 ] 

Brahma Reddy Battula edited comment on HDFS-10930 at 12/1/16 3:22 AM:
--

 *Regarding Trunk commit.* 
1) It seems the commit message contains a mistaken JIRA ID; it should be HDFS-10930, right?
{noformat}HADOOP-10930. Refactor: Wrap Datanode IO related operations. 
Contributed by Xiaoyu Yao.{noformat}

2) Can you look at the following findbugs warning?
{noformat}
FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long) may fail to clean up java.io.InputStream on checked exception Obligation 
to clean up resource created at BlockPoolSlice.java:clean up 
java.io.InputStream on checked exception Obligation to clean up resource 
created at BlockPoolSlice.java:[line 720] is not discharged 
{noformat}
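
The usual fix shape for this Findbugs pattern, as a rough sketch (the class, 
method and names are illustrative, not the actual BlockPoolSlice code):
{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamCleanupSketch {
  // try-with-resources closes the stream on every path, including checked
  // exceptions thrown mid-validation, which discharges the obligation.
  static long validateAndMeasure(File blockFile) throws IOException {
    try (InputStream in = new FileInputStream(blockFile)) {
      long length = 0;
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        length += n; // real code would also verify checksums here
      }
      return length;
    }
  }
}
{code}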




was (Author: brahmareddy):
 *Regarding Trunk commit.* 
1) It seems the commit message contains the wrong JIRA ID; it should be HDFS-10930.
{noformat}HADOOP-10930. Refactor: Wrap Datanode IO related operations. 
Contributed by Xiaoyu Yao.{noformat}

2) Can you look at the following findbugs warning?
{noformat}
FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long) may fail to clean up java.io.InputStream on checked exception Obligation 
to clean up resource created at BlockPoolSlice.java:clean up 
java.io.InputStream on checked exception Obligation to clean up resource 
created at BlockPoolSlice.java:[line 720] is not discharged 
{noformat}



> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, 
> HDFS-10930-branch-2.001.patch, HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket is opened to consolidate IO related operations for easy 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710712#comment-15710712
 ] 

Brahma Reddy Battula commented on HDFS-10930:
-

 *Regarding Trunk commit.* 
1) It seems the commit message contains the wrong JIRA ID; it should be HDFS-10930.
{noformat}HADOOP-10930. Refactor: Wrap Datanode IO related operations. 
Contributed by Xiaoyu Yao.{noformat}

2) Can you look at the following findbugs warning?
{noformat}
FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.validateIntegrityAndSetLength(File,
 long) may fail to clean up java.io.InputStream on checked exception Obligation 
to clean up resource created at BlockPoolSlice.java:clean up 
java.io.InputStream on checked exception Obligation to clean up resource 
created at BlockPoolSlice.java:[line 720] is not discharged 
{noformat}
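
For reference, the usual fix for this findbugs pattern is try-with-resources, 
so the stream is closed even when a checked exception is thrown mid-read. A 
rough sketch of the shape (illustrative only, not the actual BlockPoolSlice 
code):
{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: the stream is always closed on exit, which discharges the
// findbugs "obligation to clean up resource" warning.
static long countBytes(File blockFile) throws IOException {
  try (InputStream in = new FileInputStream(blockFile)) {
    long len = 0;
    byte[] buf = new byte[8192];
    for (int n; (n = in.read(buf)) > 0; ) {
      len += n;
    }
    return len;
  }
}
{code}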



> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, 
> HDFS-10930-branch-2.001.patch, HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket is opened to consolidate IO related operations for easy 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-11-30 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710709#comment-15710709
 ] 

Wei-Chiu Chuang commented on HDFS-11160:


Thanks Yongjun, much appreciate your help here. I looked at the v003 patch 
quickly and it looks reasonable to me. I agree with you: there's no need to 
load the last chunk checksum if the last chunk is full.
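
(For anyone following along, that condition is just a chunk-boundary test, 
roughly like the sketch below; {{loadLastPartialChunkChecksum()}} is a 
hypothetical helper name, purely for illustration.)
{code}
// The last chunk is full exactly when the visible length is a multiple of
// the checksum chunk size; only a partial trailing chunk needs its checksum.
long visibleLength = replica.getVisibleLength();
int bytesPerChecksum = checksum.getBytesPerChecksum();
byte[] lastChunkChecksum = (visibleLength % bytesPerChecksum == 0)
    ? null                              // full last chunk: nothing to load
    : loadLastPartialChunkChecksum();   // hypothetical helper
{code}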

> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption 
> rate in a relatively small cluster. Initially, we thought HDFS-11056 was to 
> blame. However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-11-30 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710657#comment-15710657
 ] 

Yongjun Zhang commented on HDFS-11160:
--

About your optimization:
{quote}
The basic idea is to copy the in-memory last partial chunk checksum in RBW 
replica when converting an RBW to Finalized,
{quote}
I think for the bug reported here, what happened is

FinalizedReplica (S0) --> RBW (Append, S1) --> FinalizedReplica (S2)

The BlockSender constructor runs at S0; then an append happens and the replica 
goes through S1 to S2, and at S2 the partial checksum on disk is updated. Then 
BlockSender starts reading and transferring the data, and gets a mismatched 
checksum.

So I think your above optimization doesn't help this jira.

What do you think?

Thanks.


> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption 
> rate in a relatively small cluster. Initially, we thought HDFS-11056 was to 
> blame. However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Comment: was deleted

(was: [~wutak...@amazon.com]
Thanks for your comments. The patch is ready, and I think the failing tests are 
not related to this patch.
Actually, this bugfix has been running in our production environment for over 
half a year and the exception has not appeared.)

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10453:
---
Comment: was deleted

(was: [~wutak...@amazon.com]
Thanks for your comments. The patch is ready, and I think the failing tests are 
not related to this patch.
Actually, this bugfix has been running in our production environment for over 
half a year and the exception has not appeared.)

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread Taklon Stephen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710653#comment-15710653
 ] 

Taklon Stephen Wu commented on HDFS-10453:
--

+1, could someone please review this patch again?

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710645#comment-15710645
 ] 

He Xiaoqiao commented on HDFS-10453:


[~wutak...@amazon.com]
Thanks for your comments. The patch is ready, and I think the failing tests are 
not related to this patch.
Actually, this bugfix has been running in our production environment for over 
half a year and the exception has not appeared.
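
For readers following the thread, the core of the proposed fix is a guard of 
roughly this shape in ReplicationMonitor's work computation (a sketch based on 
the description above; the exact patch may differ):
{code}
// Sketch: a block whose file was deleted concurrently is marked with
// NumBytes == BlockCommand.NO_ACK (Long.MAX_VALUE), so no datanode can
// ever satisfy its space requirement -- skip it instead of retrying.
if (block.getNumBytes() == BlockCommand.NO_ACK) {
  neededReplications.remove(block, priority);
  continue;
}
{code}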

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly

2016-11-30 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11160:
-
Attachment: HDFS-11160.003.patch

Hi [~jojochuang],

Thanks for your work here. I did a review of your patch.

While the optimization discussion is still ongoing, I focused on the 
implementation. I think it's not good to let BlockSender be aware of 
FsVolumeImpl, because that seems like an abstraction violation.

I changed the implementation to address this and uploaded patch rev 003. 
Basically, I think we can give FinalizedReplica an API similar to the one on 
the RBW replica to get the last partial chunk checksum.
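
Concretely, something of this shape (a sketch only; the member names are 
illustrative and not necessarily the exact rev 003 code):
{code}
// Sketch: FinalizedReplica keeps the last partial chunk checksum the same
// way the RBW replica does, so BlockSender can ask the replica directly
// instead of reaching into FsVolumeImpl.
public class FinalizedReplica /* extends the usual replica base class */ {
  private byte[] lastPartialChunkChecksum;

  public byte[] getLastPartialChunkChecksum() {
    return lastPartialChunkChecksum;
  }

  public void setLastPartialChunkChecksum(byte[] checksum) {
    this.lastPartialChunkChecksum = checksum;
  }
}
{code}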

A possible optimization is to skip this when the visibleLength is at a chunk 
boundary (I have not added this change).

I did not go through the test code yet.

Please take a look at what I changed; I hope it makes sense to you.

Thanks.


> VolumeScanner reports write-in-progress replicas as corrupt incorrectly
> ---
>
> Key: HDFS-11160
> URL: https://issues.apache.org/jira/browse/HDFS-11160
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: CDH5.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, 
> HDFS-11160.003.patch, HDFS-11160.reproduce.patch
>
>
> Due to a race condition initially reported in HDFS-6804, VolumeScanner may 
> erroneously detect good replicas as corrupt. This is serious because in some 
> cases it results in data loss if all replicas are declared corrupt. This bug 
> is especially prominent when there are a lot of append requests via 
> HttpFs/WebHDFS.
> We are investigating an incident that caused a very high block corruption 
> rate in a relatively small cluster. Initially, we thought HDFS-11056 was to 
> blame. However, after applying HDFS-11056, we are still seeing VolumeScanner 
> reporting corrupt replicas.
> It turns out that if a replica is being appended while VolumeScanner is 
> scanning it, VolumeScanner may use the new checksum to compare against old 
> data, causing a checksum mismatch.
> I have a unit test to reproduce the error. Will attach later. A quick and 
> simple fix is to hold the FsDatasetImpl lock and read the checksum from disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710641#comment-15710641
 ] 

He Xiaoqiao commented on HDFS-10453:


[~wutak...@amazon.com]
Thanks for your comments. The patch is ready, and I think the failing tests are 
not related to this patch.
Actually, this bugfix has been running in our production environment for over 
half a year and the exception has not appeared.

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread He Xiaoqiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710643#comment-15710643
 ] 

He Xiaoqiao commented on HDFS-10453:


[~wutak...@amazon.com]
Thanks for your comments. The patch is ready, and I think the failing tests are 
not related to this patch.
Actually, this bugfix has been running in our production environment for over 
half a year and the exception has not appeared.

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with low 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replicas (3);
> (2) increase the replication (to 10) of the file;
> (3) delete the file while ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> When the ReplicationMonitor stall reappears, the NameNode prints logs like:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This happens because two threads (#NameNodeRpcServer and #ReplicationMonitor) 
> process the same block at the same moment.
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and leaves the global lock.
> (2) FSNamesystem#delete is invoked to delete blocks, then clears the 
> references in blocksmap, needReplications, etc. The block's NumBytes is set 
> to NO_ACK (Long.MAX_VALUE), which is used to indicate that the block deletion 
> does not need an explicit ACK from the node. 
> (3) ReplicationMonitor#computeReplicationWorkForBlocks continues to 
> chooseTargets for the same blocks, and no node will be selected after 
> traversing the whole cluster because no node can satisfy the goodness 
> criteria (its remaining space would have to reach the required size 
> Long.MAX_VALUE). 
> During stage #3 the ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks & neededReplications keep growing with 
> no consumer. At worst it will lose data.
> This can mostly be avoided by skipping chooseTarget for BlockCommand.NO_ACK 
> blocks and removing them from neededReplications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710596#comment-15710596
 ] 

Brahma Reddy Battula commented on HDFS-11192:
-

[~kihwal]/[~xyao]/[~daryn], any thoughts on this, as you worked on HDFS-8865?

> OOM during Quota Initialization lead to Namenode hang
> -
>
> Key: HDFS-11192
> URL: https://issues.apache.org/jira/browse/HDFS-11192
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot 
> be created, it will not notify the parent. The parent keeps waiting for the 
> notify call, and that wait is not even a timed wait.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
> "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
> native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11184) Ozone: SCM: Make SCM use container protocol

2016-11-30 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11184:

Attachment: HDFS-11184-HDFS-7240.005.patch

Patch v5: misc cleanup.

> Ozone: SCM: Make SCM use container protocol
> ---
>
> Key: HDFS-11184
> URL: https://issues.apache.org/jira/browse/HDFS-11184
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-11184-HDFS-7240.001.patch, 
> HDFS-11184-HDFS-7240.002.patch, HDFS-11184-HDFS-7240.003.patch, 
> HDFS-11184-HDFS-7240.004.patch, HDFS-11184-HDFS-7240.005.patch
>
>
> SCM will start using the container protocol to communicate with datanodes. 
> This change introduces some test failures due to missing features that will 
> be moved to KSM. A separate JIRA will be filed to track the disabled ozone 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710576#comment-15710576
 ] 

Brahma Reddy Battula commented on HDFS-11178:
-

[~linyiqun] thanks for updating the patch and reminding me.

I am thinking of a slightly cleaner way, like the following, so that we can 
avoid the {{if (groupSize == nr.liveReplicas())}} check:
{code}
// Assuming the surrounding GenericTestUtils.waitFor call, restored here for
// readability (the supplier body is the suggested change):
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    try {
      // trigger dn's FBR. The FBR will add block-dn mapping.
      cluster.triggerBlockReports();

      // make sure NN has correct block-dn mapping
      BlockInfoStriped blockInfo = (BlockInfoStriped) cluster
          .getNamesystem().getFSDirectory().getINode(ecFile.toString())
          .asFile().getLastBlock();
      NumberReplicas nr = spy.countNodes(blockInfo);
      return nr.excessReplicas() == 0 && nr.liveReplicas() == groupSize;
    } catch (Exception ignored) {
      // Ignore the exception and retry until the timeout expires
      return false;
    }
  }
}, 3000, 6);
{code}

We can also remove {{import org.junit.Assert;}} after the above modification.

> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11178.001.patch, HDFS-11178.002.patch
>
>
> The test {{TestAddStripedBlockInFBR#testAddBlockInFullBlockReport}} fails 
> easily in trunk. The failure is easy to reproduce: it fails 2~3 times when I 
> run the test 4~5 times locally. It also failed in a recent Jenkins 
> run(https://builds.apache.org/job/PreCommit-HDFS-Build/17667/testReport/). 
> The stack info:
> {code}
> java.lang.AssertionError: expected:<9> but was:<7>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR.testAddBlockInFullBlockReport(TestAddStripedBlockInFBR.java:108)
> {code}
> The fix is easy: use {{GenericTestUtils.waitFor}} to wait for the full 
> block reports to come in. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11170) Add create API in filesystem public class to support assign parameter through builder

2016-11-30 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710559#comment-15710559
 ] 

SammiChen commented on HDFS-11170:
--

Thanks [~andrew.wang] for providing such a good suggestion! Will follow those 
thoughts.
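
For context, the kind of builder-based create API being discussed would look 
roughly like this (every name below is hypothetical, purely for illustration):
{code}
// Hypothetical fluent builder instead of a long positional parameter list:
FSDataOutputStream out = fs.newFile(new Path("/tmp/demo.txt"))
    .replication((short) 3)
    .blockSize(128 * 1024 * 1024)
    .bufferSize(4096)
    .overwrite(true)
    .build();
{code}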

> Add create API in filesystem public class to support assign parameter through 
> builder
> -
>
> Key: HDFS-11170
> URL: https://issues.apache.org/jira/browse/HDFS-11170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
>
> The FileSystem class supports multiple create functions to help users create 
> files. Some create functions have many parameters, and it's hard for users 
> to remember exactly what these parameters are and in what order. This task 
> is to add builder-based create functions to help users create files more 
> easily. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11192) OOM during Quota Initialization lead to Namenode hang

2016-11-30 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-11192:

Description: 
AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot 
be created, it will not notify the parent. The parent keeps waiting for the 
notify call, and that wait is not even a timed wait.

 *Trace from Namenode log* 


{noformat}
Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
"ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at 
java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
at 
java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
at 
java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at 
java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
at 
java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
at 
java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
{noformat}

  was:
 *Trace from Namenode log* 


{noformat}
Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
"ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at 
java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
at 
java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
at 
java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at 
java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
at 
java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
at 
java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
{noformat}


> OOM during Quota Initialization lead to Namenode hang
> -
>
> Key: HDFS-11192
> URL: https://issues.apache.org/jira/browse/HDFS-11192
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot 
> be created, it will not notify the parent. The parent keeps waiting for the 
> notify call, and that wait is not even a timed wait.
>  *Trace from Namenode log* 
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread 
> "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new 
> native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
> at 
> java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
> at 
> java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
> at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11170) Add create API in filesystem public class to support assign parameter through builder

2016-11-30 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-11170:
-
Assignee: Wei Zhou  (was: SammiChen)

> Add create API in filesystem public class to support assign parameter through 
> builder
> -
>
> Key: HDFS-11170
> URL: https://issues.apache.org/jira/browse/HDFS-11170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
>
> The FileSystem class supports multiple create functions to help users create 
> files. Some create functions have many parameters, and it's hard for users 
> to remember exactly what these parameters are and in what order. This task 
> is to add builder-based create functions to help users create files more 
> easily. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11180) Intermittent deadlock in NameNode when failover happens.

2016-11-30 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-11180:
-
Release Note:   (was: 
After this fix, the values of the following metrics/JMX can be incorrect.
* LastWrittenTransactionId (FSNameSystem)
* TotalSyncCount (FSNameSystem)
* TransactionsSinceLastLogRoll (FSNameSystem)
* TransactionsSinceLastCheckpoint (FSNameSystem)
* NameJournalStatus (NameNodeMXBean)
* JournalTransactionInfo (NameNodeMXBean)

This is because the FSEditLog lock is no longer held when acquiring the 
values of these metrics/JMX.)

> Intermittent deadlock in NameNode when failover happens.
> 
>
> Key: HDFS-11180
> URL: https://issues.apache.org/jira/browse/HDFS-11180
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Abhishek Modi
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: high-availability
> Attachments: HDFS-11180.00.patch, HDFS-11180.01.patch, 
> HDFS-11180.02.patch, HDFS-11180.03.patch, HDFS-11180.04.patch, jstack.log
>
>
> It is happening due to metrics getting updated at the same time the failover 
> happens. Please find the attached jstack from that point in time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11182) Update DataNode to use DatasetVolumeChecker

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710528#comment-15710528
 ] 

Hadoop QA commented on HDFS-11182:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
48s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 27 new + 447 unchanged - 13 fixed = 474 total (was 460) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 48s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDiskError |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11182 |
| GITHUB PR | https://github.com/apache/hadoop/pull/168 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 180afbaf1a82 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 69fb70c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17719/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |

[jira] [Commented] (HDFS-11180) Intermittent deadlock in NameNode when failover happens.

2016-11-30 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710527#comment-15710527
 ] 

Akira Ajisaka commented on HDFS-11180:
--

Thanks [~kihwal] for reviewing my patch.
bq. For most metrics uses, if not all, the values won't be practically no more 
"inaccurate" than before.
Agreed. Removing the release note.

> Intermittent deadlock in NameNode when failover happens.
> 
>
> Key: HDFS-11180
> URL: https://issues.apache.org/jira/browse/HDFS-11180
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Abhishek Modi
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: high-availability
> Attachments: HDFS-11180.00.patch, HDFS-11180.01.patch, 
> HDFS-11180.02.patch, HDFS-11180.03.patch, HDFS-11180.04.patch, jstack.log
>
>
> It is happening due to metrics getting updated at the same time when failover 
> is happening. Please find attached jstack at that point of time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11178) TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in trunk

2016-11-30 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710503#comment-15710503
 ] 

Yiqun Lin commented on HDFS-11178:
--

Hi [~brahmareddy], I think we can hold off on committing for a couple of days 
in case there are further comments. What do you think?

> TestAddStripedBlockInFBR#testAddBlockInFullBlockReport fails frequently in 
> trunk
> 
>
> Key: HDFS-11178
> URL: https://issues.apache.org/jira/browse/HDFS-11178
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11178.001.patch, HDFS-11178.002.patch
>
>
> The test {{TestAddStripedBlockInFBR#testAddBlockInFullBlockReport}} fails 
> easily in trunk. The failure is easy to reproduce: it fails 2~3 times when I 
> run the test 4~5 times locally. It also failed in the recent 
> Jenkins run (https://builds.apache.org/job/PreCommit-HDFS-Build/17667/testReport/).
> The stack trace:
> {code}
> java.lang.AssertionError: expected:<9> but was:<7>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR.testAddBlockInFullBlockReport(TestAddStripedBlockInFBR.java:108)
> {code}
> The fix is straightforward: use {{GenericTestUtils.waitFor}} to wait for the 
> full block reports to be processed, as sketched below. 
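> A minimal sketch of that pattern (assuming Hadoop's {{GenericTestUtils}} and 
> Guava's {{Supplier}}; the condition and helper names below are placeholders, 
> not the actual test code):
> {code}
> import com.google.common.base.Supplier;
> import org.apache.hadoop.test.GenericTestUtils;
> 
> // Instead of asserting immediately after triggering the full block report,
> // poll until the NameNode has actually processed it.
> GenericTestUtils.waitFor(new Supplier<Boolean>() {
>   @Override
>   public Boolean get() {
>     // reportedBlockCount() stands in for the assertion's real condition.
>     return reportedBlockCount() == expectedBlockCount;
>   }
> }, 100, 10000); // re-check every 100 ms, time out after 10 seconds
> {code}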



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11156) Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API

2016-11-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710484#comment-15710484
 ] 

Andrew Wang commented on HDFS-11156:


bq. this API returns BlockLocations[] + FileStatus, it has more information but 
also means more work for clients to parse and more stuff on network. 
getFileBlockLocations API should be good in most cases.

I'm guessing your client already has code to parse a FileStatus, so is this 
really more work? It depends on your use case, but often the client will want a 
FileStatus too.

Anyway, I'm okay with a GETFILEBLOCKLOCATIONS API if that's what fits best. 
Thanks for working on this [~cheersyang]!
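
For reference, a minimal sketch of a client using the {{FileSystem}} contract 
in question (the path and byte range are placeholders; assumes a method that 
may throw IOException):
{code}
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

FileSystem fs = FileSystem.get(new Configuration());
// Ask for the block locations covering bytes [0, 100) of the file.
BlockLocation[] locs = fs.getFileBlockLocations(new Path("/tmp/test.log"), 0, 100);
for (BlockLocation loc : locs) {
  System.out.println(loc.getOffset() + " len=" + loc.getLength()
      + " hosts=" + Arrays.toString(loc.getHosts()));
}
{code}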

> Webhdfs rest api GET_BLOCK_LOCATIONS output doesn't comply with FileSystem API
> --
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch
>
>
> Following webhdfs REST API
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents for *o.a.h.h.p.LocatedBlocks*. However according to 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11112) Journal Nodes should refuse to format non-empty directories

2016-11-30 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710466#comment-15710466
 ] 

Yiqun Lin commented on HDFS-2:
--

Hi [~arpitagarwal], would you have a quick look at the latest patch? I see the 
state of this JIRA has not been updated for many days. Thanks.

> Journal Nodes should refuse to format non-empty directories
> ---
>
> Key: HDFS-2
> URL: https://issues.apache.org/jira/browse/HDFS-2
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Yiqun Lin
> Attachments: HDFS-2.001.patch, HDFS-2.002.patch
>
>
> Journal Nodes should reject the {{format}} RPC request if a storage directory 
> is non-empty. The relevant code is in {{JNStorage#format}}.
> {code}
>   void format(NamespaceInfo nsInfo) throws IOException {
> setStorageInfo(nsInfo);
> ...
> unlockAll();
> sd.clearDirectory();
> writeProperties(sd);
> createPaxosDir();
> analyzeStorage();
> {code}
> This would make the behavior similar to {{namenode -format -nonInteractive}}.
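> A minimal sketch of the guard being proposed, built on the snippet above (a 
> hedged illustration, not the attached patch; {{sd.getRoot()}} is an assumed 
> accessor):
> {code}
>   void format(NamespaceInfo nsInfo) throws IOException {
>     // Refuse to format a directory that already has content, mirroring
>     // namenode -format -nonInteractive.
>     File root = sd.getRoot();
>     String[] children = root.list();
>     if (children != null && children.length > 0) {
>       throw new IOException("Storage directory " + root
>           + " is not empty; refusing to format");
>     }
>     setStorageInfo(nsInfo);
>     ...
>   }
> {code}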



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Description: The metrics collected in HDFS-10917 should be reported to and 
aggregated on the NameNode as part of heartbeat messages. This will make it 
easy to expose them through JMX to users who are interested in them.  (was: 
These metrics should be reported to the NameNode as part of heartbeat messages. 
In addition, the NameNode should maintain the corresponding metrics in 
aggregated form, so they are readily consumable by DataNode self-healing 
actions afterwards.)

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> The metrics collected in HDFS-10917 should be reported to and aggregated on 
> the NameNode as part of heartbeat messages. This will make it easy to expose 
> them through JMX to users who are interested in them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710456#comment-15710456
 ] 

Arpit Agarwal commented on HDFS-10930:
--

Sigh.. I am not sure how to get Jenkins to stop looking at the pull request. We 
may have to run unit tests locally.

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, 
> HDFS-10930-branch-2.001.patch, HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710452#comment-15710452
 ] 

Hadoop QA commented on HDFS-10930:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-10930 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10930 |
| GITHUB PR | https://github.com/apache/hadoop/pull/160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17721/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, 
> HDFS-10930-branch-2.001.patch, HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10930:
--
Attachment: HDFS-10930-branch-2.001.patch

Rename to trigger Jenkins for branch-2.

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, 
> HDFS-10930-branch-2.001.patch, HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Summary: Maintain aggregated peer performance metrics on NameNode  (was: 
Report performance metrics to )

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> These metrics should be reported to the NameNode as part of heartbeat 
> messages. In addition, the NameNode should maintain the corresponding metrics 
> in aggregated form, so they are readily consumable by DataNode self-healing 
> actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Report performance metrics to

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Summary: Report performance metrics to   (was: Maintain aggregated peer 
performance metrics on NameNode)

> Report performance metrics to 
> --
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> These metrics should be reported to the NameNode as part of heartbeat 
> messages. In addition, the NameNode should maintain the corresponding metrics 
> in aggregated form, so they are readily consumable by DataNode self-healing 
> actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Description: These metrics should be reported to the NameNode as part of 
heartbeat messages. In addition, the NameNode should maintain the corresponding 
metrics in aggregated form, so they are readily consumable by DataNode 
self-healing actions afterwards.  (was: HDFS-10917 added various peer 
performance metrics on individual DataNodes. These metrics should be reported 
to the NameNode as part of heartbeat messages. In addition, the NameNode should 
maintain the corresponding metrics in aggregated form, so they are readily 
consumable by DataNode self-healing actions afterwards.)

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> These metrics should be reported to the NameNode as part of heartbeat 
> messages. In addition, the NameNode should maintain the corresponding metrics 
> in aggregated form, so they are readily consumable by DataNode self-healing 
> actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10917) Collect peer performance statistics on DataNode.

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10917:
-
Description: DataNodes already detect if replication pipeline operations 
are slow and log warnings. For the purpose of analysis, performance metrics are 
desirable. This proposes adding them to DataNodes.   (was: DataNodes already 
detect if replication pipeline operations are slow and log warnings. We can 
further e)

> Collect peer performance statistics on DataNode.
> 
>
> Key: HDFS-10917
> URL: https://issues.apache.org/jira/browse/HDFS-10917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> DataNodes already detect if replication pipeline operations are slow and log 
> warnings. For the purpose of analysis, performance metrics are desirable. 
> This proposes adding them to DataNodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10917) Collect peer performance statistics on DataNode.

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10917:
-
Description: DataNodes already detect if replication pipeline operations 
are slow and log warnings. We can further e  (was: DataNodes already detect if 
replication pipeline operations are slow and log warnings. For the purpose of 
analysis, performance metrics are desirable. This proposes adding them to 
DataNodes. These performance-related statistics can be aggregated on the 
DataNode.)

> Collect peer performance statistics on DataNode.
> 
>
> Key: HDFS-10917
> URL: https://issues.apache.org/jira/browse/HDFS-10917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> DataNodes already detect if replication pipeline operations are slow and log 
> warnings. We can further e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10917) Collect peer performance statistics on DataNode.

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10917:
-
Summary: Collect peer performance statistics on DataNode.  (was: Maintain 
various peer performance statistics on DataNode)

> Collect peer performance statistics on DataNode.
> 
>
> Key: HDFS-10917
> URL: https://issues.apache.org/jira/browse/HDFS-10917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> DataNodes already detect if replication pipeline operations are slow and log 
> warnings. For the purpose of analysis, performance metrics are desirable. 
> This proposes adding them to DataNodes. These performance-related statistics 
> can be aggregated on the DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10917) Maintain various peer performance statistics on DataNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10917:
-
Description: DataNodes already detect if replication pipeline operations 
are slow and log warnings. For the purpose of analysis, performance metrics are 
desirable. This proposes adding them to DataNodes. These performance-related 
statistics can be aggregated on the DataNode.  (was: DataNodes already detect 
if replication pipeline operations are slow and log warnings. For the purpose 
of analysis, performance metrics are desirable. This proposes adding them to 
DataNodes. These performance-related statistics can be aggregated and reported 
to the NameNode as part of heartbeat messages.)

> Maintain various peer performance statistics on DataNode
> 
>
> Key: HDFS-10917
> URL: https://issues.apache.org/jira/browse/HDFS-10917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> DataNodes already detect if replication pipeline operations are slow and log 
> warnings. For the purpose of analysis, performance metrics are desirable. 
> This proposes adding them to DataNodes. These performance-related statistics 
> can be aggregated on the DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-5517) Lower the default maximum number of blocks per file

2016-11-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710421#comment-15710421
 ] 

Hudson commented on HDFS-5517:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10921 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10921/])
HDFS-5517. Lower the default maximum number of blocks per file. (wang: rev 
7226a71b1f684f562bd88ee121f1dd7aa8b73816)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java


> Lower the default maximum number of blocks per file
> ---
>
> Key: HDFS-5517
> URL: https://issues.apache.org/jira/browse/HDFS-5517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-5517.002.patch, HDFS-5517.003.patch, HDFS-5517.patch
>
>
> We introduced the maximum number of blocks per file in HDFS-4305, but we set 
> the default to 1MM. In practice this limit is so high as to never be hit, 
> whereas we know that an individual file with 10s of thousands of blocks can 
> cause problems. We should lower the default value, in my opinion to 10k.
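> For operators who prefer the old behavior, the limit can be raised again in 
> hdfs-site.xml (a sketch; the value shown is the new default):
> {code}
> <property>
>   <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
>   <value>10000</value>
> </property>
> {code}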



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710418#comment-15710418
 ] 

Hadoop QA commented on HDFS-10930:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-10930 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10930 |
| GITHUB PR | https://github.com/apache/hadoop/pull/160 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17720/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, HDFS-10930.01.patch, 
> HDFS-10930.02.patch, HDFS-10930.03.patch, HDFS-10930.04.patch, 
> HDFS-10930.05.patch, HDFS-10930.06.patch, HDFS-10930.07.patch, 
> HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2016-11-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710408#comment-15710408
 ] 

Andrew Wang commented on HDFS-11096:


Thanks for the comment Ming, inline:

bq. To have comprehensive coverage, can we use static analysis tool to do cross 
reference between branch 2 to trunk's source code, or JACC can handle that?

I think this is indeed handled by JACC. If you want to look at a sample report, 
I set up a nightly job that runs JACC, e.g. 
https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-trunk-JACC/19/artifact/target/compat-check/report.html

We could really use some way of whitelisting issues, but I haven't figured that 
out yet.

bq. maybe there is a way to test it on a dev machine,

Yea, the difficulty is building and maintaining this test framework. Perhaps 
Bigtop can help? I think HBase has some integration tests using Docker also.

bq. For prioritization, we can also evaluate the impact or there is any work 
around. For example, we don't have to verify 2.7 -> 3.0 if we know 2.7 -> 2.8 
and 2.8 -> 3.0 work.

Another idea to limit scope is to require a minimum 2.x version for 2.x-3.x 
upgrades. HDFS-11096 proposes a min version of 2.1.0-beta, but we could 
increase this to 2.7.1 or something instead.

bq. Rolling upgrade is important for high SLA data ingestion or maybe hbase 
scenario. Without that, we have to fail over the entire cluster during upgrade.

Got it. In terms of priority I think this still ranks below API, ABI, and 
client/server wire compatibility, but it sounds like we still need it for 
beta/GA.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710386#comment-15710386
 ] 

Arpit Agarwal commented on HDFS-10930:
--

+1 for the branch-2 patch, pending Jenkins. Thanks [~xyao].

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, HDFS-10930.01.patch, 
> HDFS-10930.02.patch, HDFS-10930.03.patch, HDFS-10930.04.patch, 
> HDFS-10930.05.patch, HDFS-10930.06.patch, HDFS-10930.07.patch, 
> HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Component/s: namenode

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> HDFS-10917 added various peer performance metrics on individual DataNodes. 
> These metrics should be reported to the NameNode as part of heartbeat 
> messages. In addition, the NameNode should maintain the corresponding metrics 
> in aggregated form, so they are readily consumable by DataNode self-healing 
> actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HDFS-11194:


 Summary: Maintain aggregated peer performance metrics on NameNode
 Key: HDFS-11194
 URL: https://issues.apache.org/jira/browse/HDFS-11194
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou


HDFS-10917 added various peer performance metrics on individual DataNodes. 
These metrics should be reported to the NameNode as part of heartbeat messages. 
In addition, the NameNode should maintain the corresponding metrics in 
aggregated form, so they are readily consumable by DataNode self-healing 
actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11194) Maintain aggregated peer performance metrics on NameNode

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-11194:
-
Affects Version/s: 2.8.0

> Maintain aggregated peer performance metrics on NameNode
> 
>
> Key: HDFS-11194
> URL: https://issues.apache.org/jira/browse/HDFS-11194
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> HDFS-10917 added various peer performance metrics on individual DataNodes. 
> These metrics should be reported to the NameNode as part of heartbeat 
> messages. In addition, the NameNode should maintain the corresponding metrics 
> in aggregated form, so they are readily consumable by DataNode self-healing 
> actions afterwards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-10930:
--

Reopening for branch-2 Jenkins run.

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, HDFS-10930.01.patch, 
> HDFS-10930.02.patch, HDFS-10930.03.patch, HDFS-10930.04.patch, 
> HDFS-10930.05.patch, HDFS-10930.06.patch, HDFS-10930.07.patch, 
> HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10930:
-
Status: Patch Available  (was: Reopened)

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, HDFS-10930.01.patch, 
> HDFS-10930.02.patch, HDFS-10930.03.patch, HDFS-10930.04.patch, 
> HDFS-10930.05.patch, HDFS-10930.06.patch, HDFS-10930.07.patch, 
> HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10930:
-
Attachment: HDFS-10930-branch-2.00.patch

Rename Xiaoyu's patch for Jenkins.

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930-branch-2.00.patch, HDFS-10930.01.patch, 
> HDFS-10930.02.patch, HDFS-10930.03.patch, HDFS-10930.04.patch, 
> HDFS-10930.05.patch, HDFS-10930.06.patch, HDFS-10930.07.patch, 
> HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, 
> BlockReceiver.java, BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket was opened to consolidate IO-related operations for easier 
> instrumentation, metrics collection, logging and troubleshooting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11189) WebHDFS and FollowRedirects

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710288#comment-15710288
 ] 

Weiwei Yang commented on HDFS-11189:


Hi [~guadilla]

It is not clear what the problem was. WebHDFS was designed not to follow 
redirects automatically: you don't want to send large amounts of data directly 
to the NN; instead, the NN tells you which DN to write to, and you then use a 
second request to transfer the data to that DN (see the sketch below). Could 
you provide more info about how webhdfs crashed? At this point, we can't tell 
whether this is a webhdfs problem or not.
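
A minimal sketch of that two-step flow with a plain HttpURLConnection (host, 
port, path, and the payload are placeholders; assumes a method that may throw 
IOException):
{code}
import java.net.HttpURLConnection;
import java.net.URL;

// Step 1: ask the NN where to write; do NOT follow the redirect, so that no
// file data is sent to the NN itself.
URL nn = new URL("http://namenode:50070/webhdfs/v1/tmp/f?op=CREATE");
HttpURLConnection conn = (HttpURLConnection) nn.openConnection();
conn.setRequestMethod("PUT");
conn.setInstanceFollowRedirects(false);
String dnLocation = conn.getHeaderField("Location"); // DN URL from the 307
conn.disconnect();

// Step 2: stream the data directly to the DN returned above.
HttpURLConnection dn = (HttpURLConnection) new URL(dnLocation).openConnection();
dn.setRequestMethod("PUT");
dn.setDoOutput(true);
dn.getOutputStream().write(payload); // 'payload' is a placeholder byte[]
System.out.println("DN response: " + dn.getResponseCode());
{code}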

> WebHDFS and FollowRedirects
> ---
>
> Key: HDFS-11189
> URL: https://issues.apache.org/jira/browse/HDFS-11189
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.3
> Environment: We are using webhdfs from a j2ee environment with java 
> 6. In the web application we have a lot of libraries: spring, jersey, 
> jackson...
>Reporter: Oscar Guadilla
>
> In some cases with simple operations (get file - a non-two-step operation) we 
> detect that the HTTP FollowRedirects flag comes back as "false", and it makes 
> webhdfs crash.
> We don't really know which library changes this behavior, and it is really 
> strange because the change appears to be made JVM-wide: other wars/ears on 
> the same j2ee server fail as well. If we restart the j2ee server it starts 
> working again.
> To fix the problem we changed the WebHdfsFileSystem class, adding 
> "setInstanceFollowRedirects(true)" in the connection management instead of 
> assuming that it should be true, and it works fine.
> The same problem arises in both 1.x and 2.x webhdfs. We didn't test 3.x.
> Could you fix it? We did it in our environment, but it would be fine if it 
> could be included in the next releases.
> Thanks in advance,
> Oscar



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11188) Change min supported DN and NN versions back to 2.x

2016-11-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710276#comment-15710276
 ] 

Andrew Wang commented on HDFS-11188:


Checkstyle looks ignorable, ready for review.

> Change min supported DN and NN versions back to 2.x
> ---
>
> Key: HDFS-11188
> URL: https://issues.apache.org/jira/browse/HDFS-11188
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Critical
> Attachments: HDFS-11188.001.patch
>
>
> This is the inverse of HDFS-10398 and HADOOP-13142. Currently, trunk requires 
> a software DN and NN version of 3.0.0-alpha1. This means we cannot perform a 
> rolling upgrade from 2.x to 3.x.
> The first step towards supporting rolling upgrade is changing these back to a 
> 2.x version. For reference, branch-2 has these versions set to "2.1.0-beta".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-5517) Lower the default maximum number of blocks per file

2016-11-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5517:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha2
   Status: Resolved  (was: Patch Available)

Committed, thanks for the review Akira and sorry again for missing the broken 
tests earlier.

> Lower the default maximum number of blocks per file
> ---
>
> Key: HDFS-5517
> URL: https://issues.apache.org/jira/browse/HDFS-5517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-5517.002.patch, HDFS-5517.003.patch, HDFS-5517.patch
>
>
> We introduced the maximum number of blocks per file in HDFS-4305, but we set 
> the default to 1MM. In practice this limit is so high as to never be hit, 
> whereas we know that an individual file with 10s of thousands of blocks can 
> cause problems. We should lower the default value, in my opinion to 10k.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710258#comment-15710258
 ] 

Arpit Agarwal commented on HDFS-10913:
--

The change looks good. I am reviewing the unit tests.

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. In order to better 
> test the behavior of slow mirrors, fault injectors need to be introduced.
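> A minimal sketch of the usual fault-injector pattern in this codebase (class 
> shape only; the hook name below is illustrative, not the actual patch):
> {code}
> import java.io.IOException;
> 
> // Production code calls a no-op hook; tests install a delaying subclass.
> public class DataNodeFaultInjector {
>   private static DataNodeFaultInjector instance = new DataNodeFaultInjector();
>   public static DataNodeFaultInjector get() { return instance; }
>   public static void set(DataNodeFaultInjector injector) { instance = injector; }
> 
>   // Called from BlockReceiver on the write path; the default does nothing,
>   // while a test subclass can sleep here to simulate a slow mirror.
>   public void delayWriteToMirror() throws IOException {}
> }
> {code}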



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10177) a wrong number of the bytes read are copied into native buffer in hdfsRead()

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710230#comment-15710230
 ] 

Hadoop QA commented on HDFS-10177:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10177 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841147/HDFS-10177.002.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 209bc5719ffe 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 69fb70c |
| Default Java | 1.8.0_111 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17718/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17718/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> a wrong number of the bytes read are copied into native buffer in hdfsRead()
> 
>
> Key: HDFS-10177
> URL: https://issues.apache.org/jira/browse/HDFS-10177
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.6.0, 2.7.2, 2.6.2, 2.6.3, 2.6.4
> Environment: RHEL 6.3, 64-bit
> Oralce JDK 1.7.0_55
>Reporter: Yongtao Yang
>Assignee: Andres Perez
>  Labels: newbie
> Attachments: HDFS-10177.002.patch, HDFS-10177.patch
>
>
> in the implementation of {{hdfsRead()}},
> {code:title=hdfsRead.c}
> tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
> {
> ..
> jint noReadBytes = length;
> ..
> jthr = invokeMethod(env, , INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
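> // A minimal sketch of the fix implied by the summary (not the attached
> // patch): the count copied back to the caller must be the value actually
> // returned by read(), i.e. jVal.i, rather than the requested 'length':
> //   noReadBytes = jVal.i;
> //   (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);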

[jira] [Comment Edited] (HDFS-11191) In the command “hdfs dfsadmin -report” The Configured Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file sys

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710206#comment-15710206
 ] 

Weiwei Yang edited comment on HDFS-11191 at 11/30/16 11:35 PM:
---

Hello [~dchan...@hortonworks.com]

Are you working on this one? I am assigning this to myself to investigate, but 
if you already started working on this one, feel free to assign it back :).


was (Author: cheersyang):
Hello [~dchan...@hortonworks.com]

Are you working on this one? I am assigning this to myself, but if you already 
started working on this one, feel free to assign it back :).

> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> ---
>
> Key: HDFS-11191
> URL: https://issues.apache.org/jira/browse/HDFS-11191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.5.0
> Environment: SLES 11SP3
> HDP 2.5.0
>Reporter: Deepak Chander
>Assignee: Weiwei Yang
>
> In the command “hdfs dfsadmin -report”, the Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> If you see my datanode root file system size, it's only 38GB
> kimtest3:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> kimtest4:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
> kimtest2:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> The below is from hdfs-site.xml file 
> 
> dfs.datanode.data.dir
> file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn
>   
> I have removed the other directory grid1 and restarted datanode process.
>   
> dfs.datanode.data.dir
> file:///grid/hadoop/hdfs/dn
>   
> Now the size is reflecting correctly
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 

[jira] [Commented] (HDFS-11191) In the command “hdfs dfsadmin -report” The Configured Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710206#comment-15710206
 ] 

Weiwei Yang commented on HDFS-11191:


Hello [~dchan...@hortonworks.com]

Are you working on this one? I am assigning it to myself, but if you have 
already started working on it, feel free to assign it back :).

> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> ---
>
> Key: HDFS-11191
> URL: https://issues.apache.org/jira/browse/HDFS-11191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.5.0
> Environment: SLES 11SP3
> HDP 2.5.0
>Reporter: Deepak Chander
>Assignee: Weiwei Yang
>
> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> If you look at my datanode root file system size, it's only 38GB
> kimtest3:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> kimtest4:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
> kimtest2:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> The below is from the hdfs-site.xml file:
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn</value>
> </property>
> I have removed the other directory grid1 and restarted the datanode process.
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn</value>
> </property>
> Now the size is reflected correctly:
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 07:34:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 

[jira] [Assigned] (HDFS-11191) In the command “hdfs dfsadmin -report” The Configured Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.

2016-11-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned HDFS-11191:
--

Assignee: Weiwei Yang

> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> ---
>
> Key: HDFS-11191
> URL: https://issues.apache.org/jira/browse/HDFS-11191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.5.0
> Environment: SLES 11SP3
> HDP 2.5.0
>Reporter: Deepak Chander
>Assignee: Weiwei Yang
>
> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> If you look at my datanode root file system size, it's only 38GB
> kimtest3:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> kimtest4:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
> kimtest2:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> The below is from the hdfs-site.xml file:
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn</value>
> </property>
> I have removed the other directory grid1 and restarted the datanode process.
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn</value>
> </property>
> Now the size is reflected correctly:
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 07:34:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 6505525248 (6.06 GB)
> DFS Remaining: 33484840959 (31.19 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> 

[jira] [Commented] (HDFS-11191) In the command “hdfs dfsadmin -report” The Configured Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.

2016-11-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710201#comment-15710201
 ] 

Weiwei Yang commented on HDFS-11191:


At present the datanode capacity is a simple sum of each data volume's 
capacity, so even if 2 volumes are on the same file system, it gets counted 
twice. To fix this, o.a.h.fs.DF already provides a method to retrieve the file 
system, but we don't want to include that in each storage report in the 
heartbeat because it is a heavy unix call. I can take a look at this and see if 
there is a neat way to fix it.
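To make the idea concrete, here is a minimal, self-contained sketch of counting 
each underlying file system only once. It uses java.nio's FileStore as a 
stand-in for o.a.h.fs.DF, and the class and method names are hypothetical, not 
actual DataNode code:

{code:java}
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class DedupedCapacity {
  /** Sum capacity over data dirs, counting each mounted file system once. */
  static long dedupedCapacity(String... dataDirs) throws Exception {
    Map<String, Long> perFs = new HashMap<>();
    for (String dir : dataDirs) {
      Path p = Paths.get(dir);
      FileStore store = Files.getFileStore(p);   // identifies the mount
      // Key on the store name (e.g. /dev/mapper/system-root) so two
      // dirs on the same mount contribute their capacity only once.
      perFs.putIfAbsent(store.name(), store.getTotalSpace());
    }
    long total = 0;
    for (long capacity : perFs.values()) {
      total += capacity;
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    String[] dirs = args.length > 0 ? args : new String[] {".", ".."};
    System.out.println(dedupedCapacity(dirs));
  }
}
{code}

The heartbeat concern above still applies: for the NameNode-side report to 
de-duplicate, some mount identity would have to travel with each storage 
report, which is exactly the cost being weighed here.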

> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> ---
>
> Key: HDFS-11191
> URL: https://issues.apache.org/jira/browse/HDFS-11191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.5.0
> Environment: SLES 11SP3
> HDP 2.5.0
>Reporter: Deepak Chander
>
> In the command “hdfs dfsadmin -report” The Configured Capacity is misleading 
> if the dfs.datanode.data.dir is configured with two directories from the same 
> file system.
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
> If you look at my datanode root file system size, it's only 38GB
> kimtest3:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> kimtest4:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
> kimtest2:~ # df -h /
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
> The below is from the hdfs-site.xml file:
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn</value>
> </property>
> I have removed the other directory grid1 and restarted the datanode process.
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn</value>
> </property>
> Now the size is reflected correctly:
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
> -
> Live datanodes (3):
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2

[jira] [Updated] (HDFS-10930) Refactor: Wrap Datanode IO related operations

2016-11-30 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10930:
--
Attachment: HDFS-10930.barnch-2.00.patch

Attaching an initial branch-2 patch for review.

> Refactor: Wrap Datanode IO related operations
> -
>
> Key: HDFS-10930
> URL: https://issues.apache.org/jira/browse/HDFS-10930
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10930.01.patch, HDFS-10930.02.patch, 
> HDFS-10930.03.patch, HDFS-10930.04.patch, HDFS-10930.05.patch, 
> HDFS-10930.06.patch, HDFS-10930.07.patch, HDFS-10930.barnch-2.00.patch
>
>
> Datanode IO (Disk/Network) related operations and instrumentation are 
> currently scattered across many classes such as DataNode.java, BlockReceiver.java, 
> BlockSender.java, FsDatasetImpl.java, FsVolumeImpl.java, 
> DirectoryScanner.java, BlockScanner.java, FsDatasetAsyncDiskService.java, 
> LocalReplica.java, LocalReplicaPipeline.java, Storage.java, etc. 
> This ticket is opened to consolidate IO related operations for easy 
> instrumentation, metrics collection, logging and troubleshooting. 
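As a rough sketch of the kind of consolidation this describes (all names here 
are hypothetical, not the actual refactor), a single IO provider gives one 
choke point for timing, metrics and slow-disk logging:

{code:java}
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: route datanode IO through one wrapper so
 *  instrumentation lives in a single place. */
public class IoProviderSketch {
  public InputStream openForRead(String path) throws IOException {
    long begin = System.nanoTime();
    try {
      return new FileInputStream(path);
    } finally {
      record("open", System.nanoTime() - begin);
    }
  }

  private void record(String op, long nanos) {
    // One place for metrics/slow-disk logging instead of scattering
    // instrumentation across BlockReceiver, BlockSender, etc.
    if (nanos > 300_000_000L) {   // assumed 300 ms slow-IO threshold
      System.err.println("Slow " + op + ": " + nanos / 1_000_000 + " ms");
    }
  }
}
{code}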



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8674) Improve performance of postponed block scans

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710112#comment-15710112
 ] 

Hadoop QA commented on HDFS-8674:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
49s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 119 unchanged - 2 fixed = 119 total (was 121) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker |
|   | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
|   | hadoop.hdfs.server.datanode.TestDataNodeMXBean |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-8674 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841132/HDFS-8674.trunk.patch 
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9ceb14463035 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 

[jira] [Commented] (HDFS-11184) Ozone: SCM: Make SCM use container protocol

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710085#comment-15710085
 ] 

Hadoop QA commented on HDFS-11184:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 4s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 86 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
44s{color} | {color:green} hadoop-hdfs-project generated 0 new + 119 unchanged 
- 1 fixed = 119 total (was 120) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 9 
unchanged - 2 fixed = 9 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs generated 0 new + 9 
unchanged - 1 fixed = 9 total (was 10) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
59s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.TestDatanodeStateMachine |
|   | hadoop.ozone.web.TestOzoneVolumes |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.tools.TestDelegationTokenFetcher |
|   | hadoop.ozone.container.common.impl.TestContainerPersistence |
|   | hadoop.ozone.web.client.TestKeys |
|   | hadoop.ozone.web.TestOzoneWebAccess |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
|   | hadoop.ozone.scm.TestContainerSmallFiles |
|   | hadoop.ozone.web.client.TestBuckets |
\\
\\
|| 

[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709995#comment-15709995
 ] 

Hadoop QA commented on HDFS-10453:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
38s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 38s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.TestBlockStoragePolicy |
| JDK v1.7.0_121 Failed junit tests | 
hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HDFS-10453 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12806883/HDFS-10453-branch-2.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 6d0e25f9ff3c 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 

[jira] [Updated] (HDFS-10177) a wrong number of the bytes read are copied into native buffer in hdfsRead()

2016-11-30 Thread Andres Perez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-10177:

Attachment: HDFS-10177.002.patch

New patch removing the now-unused {{noReadBytes}} variable.

> a wrong number of the bytes read are copied into native buffer in hdfsRead()
> 
>
> Key: HDFS-10177
> URL: https://issues.apache.org/jira/browse/HDFS-10177
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.6.0, 2.7.2, 2.6.2, 2.6.3, 2.6.4
> Environment: RHEL 6.3, 64-bit
> Oracle JDK 1.7.0_55
>Reporter: Yongtao Yang
>Assignee: Andres Perez
>  Labels: newbie
> Attachments: HDFS-10177.002.patch, HDFS-10177.patch
>
>
> in the implementation of {{hdfsRead()}},
> {code:title=hdfsRead.c}
> tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
> {
> ..
> jint noReadBytes = length;
> ..
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The problem is, {{noReadBytes}} is initialized to {{length}}, but it should 
> be set to {{jVal.i}} (the number of bytes actually read) before the bytes are 
> copied into the native {{buffer}} in the last line.
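The contract at fault is the standard read contract, which is easy to 
demonstrate on the Java side. A tiny runnable sketch (names hypothetical) 
showing that only the returned count (the analogue of {{jVal.i}}) may be 
copied:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class ReadCountDemo {
  public static void main(String[] args) throws Exception {
    InputStream in = new ByteArrayInputStream("hello".getBytes());
    byte[] buf = new byte[4096];
    int nRead = in.read(buf);   // returns 5, the bytes actually read
    byte[] dest = new byte[nRead];
    // Copy nRead bytes, not buf.length -- the analogue of passing
    // jVal.i rather than length to GetByteArrayRegion in hdfsRead().
    System.arraycopy(buf, 0, dest, 0, nRead);
    System.out.println(new String(dest));  // prints "hello"
  }
}
{code}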



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10177) a wrong number of the bytes read are copied into native buffer in hdfsRead()

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709922#comment-15709922
 ] 

Hadoop QA commented on HDFS-10177:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m 14s{color} | 
{color:red} hadoop-hdfs-project_hadoop-hdfs-native-client generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
44s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10177 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841139/HDFS-10177.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 8cadaaa68dfb 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 69fb70c |
| Default Java | 1.8.0_111 |
| cc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17717/artifact/patchprocess/diff-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17717/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17717/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> a wrong number of the bytes read are copied into native buffer in hdfsRead()
> 
>
> Key: HDFS-10177
> URL: https://issues.apache.org/jira/browse/HDFS-10177
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.6.0, 2.7.2, 2.6.2, 2.6.3, 2.6.4
> Environment: RHEL 6.3, 64-bit
> Oracle JDK 1.7.0_55
>Reporter: Yongtao Yang
>Assignee: Andres Perez
>  Labels: newbie
> Attachments: HDFS-10177.patch
>
>
> in the implementation of {{hdfsRead()}},
> {code:title=hdfsRead.c}
> tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
> {
> ..
> jint noReadBytes = length;
> ..
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
>

[jira] [Updated] (HDFS-10177) a wrong number of the bytes read are copied into native buffer in hdfsRead()

2016-11-30 Thread Andres Perez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-10177:

Attachment: HDFS-10177.patch

Simple fix as suggested

> a wrong number of the bytes read are copied into native buffer in hdfsRead()
> 
>
> Key: HDFS-10177
> URL: https://issues.apache.org/jira/browse/HDFS-10177
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.6.0, 2.7.2, 2.6.2, 2.6.3, 2.6.4
> Environment: RHEL 6.3, 64-bit
> Oracle JDK 1.7.0_55
>Reporter: Yongtao Yang
>Assignee: Andres Perez
>  Labels: newbie
> Attachments: HDFS-10177.patch
>
>
> in the implementation of {{hdfsRead()}},
> {code:title=hdfsRead.c}
> tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
> {
> ..
> jint noReadBytes = length;
> ..
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The problem is, {{noReadBytes}} is initialized to {{length}}, but it should 
> be set to {{jVal.i}} (the number of bytes actually read) before the bytes are 
> copied into the native {{buffer}} in the last line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10177) a wrong number of the bytes read are copied into native buffer in hdfsRead()

2016-11-30 Thread Andres Perez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres Perez updated HDFS-10177:

Assignee: Andres Perez
  Status: Patch Available  (was: Open)

> a wrong number of the bytes read are copied into native buffer in hdfsRead()
> 
>
> Key: HDFS-10177
> URL: https://issues.apache.org/jira/browse/HDFS-10177
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.6.4, 2.6.3, 2.6.2, 2.7.2, 2.6.0
> Environment: RHEL 6.3, 64-bit
> Oracle JDK 1.7.0_55
>Reporter: Yongtao Yang
>Assignee: Andres Perez
>  Labels: newbie
>
> in the implementation of {{hdfsRead()}},
> {code:title=hdfsRead.c}
> tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
> {
> ..
> jint noReadBytes = length;
> ..
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The problem is, {{noReadBytes}} is initialized to {{length}}, but it should 
> be set to {{jVal.i}} (the number of bytes actually read) before the bytes are 
> copied into the native {{buffer}} in the last line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8674) Improve performance of postponed block scans

2016-11-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-8674:
--
Attachment: HDFS-8674.trunk.patch

Fix conflicts on trunk stemming from "under replicated" now being "low redundancy".

> Improve performance of postponed block scans
> 
>
> Key: HDFS-8674
> URL: https://issues.apache.org/jira/browse/HDFS-8674
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-8674.2.patch, HDFS-8674.patch, HDFS-8674.patch, 
> HDFS-8674.trunk.patch
>
>
> When a standby goes active, it marks all nodes as "stale" which will cause 
> block invalidations for over-replicated blocks to be queued until full block 
> reports are received from the nodes with the block.  The replication monitor 
> scans the queue with O(N) runtime.  It picks a random offset and iterates 
> through the set to randomize blocks scanned.
> The result is devastating when a cluster loses multiple nodes during a 
> rolling upgrade. Re-replication occurs, the nodes come back, the excess block 
> invalidations are postponed. Rescanning just 2k blocks out of millions of 
> postponed blocks may take multiple seconds. During the scan, the write lock 
> is held which stalls all other processing.
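For illustration only, the sketch below shows one way to avoid rescanning from 
a random offset: keep a persistent cursor over the postponed set so each pass 
resumes where the previous one stopped. This is a simplified, hypothetical 
sketch (additions simply reset the cursor), not the attached patch:

{code:java}
import java.util.Iterator;
import java.util.LinkedHashSet;

/** Hypothetical sketch: resume scanning postponed blocks where the
 *  previous pass stopped instead of iterating from a random offset. */
public class PostponedScan {
  private final LinkedHashSet<Long> postponed = new LinkedHashSet<>();
  private Iterator<Long> cursor;

  /** A structural change invalidates the cursor, so restart from the head. */
  void postpone(long blockId) {
    postponed.add(blockId);
    cursor = null;
  }

  /** Scan at most {@code limit} blocks, removing the processed ones. */
  void rescan(int limit) {
    if (cursor == null || !cursor.hasNext()) {
      cursor = postponed.iterator();   // wrap around to the head
    }
    for (int i = 0; i < limit && cursor.hasNext(); i++) {
      long blockId = cursor.next();
      if (tryProcess(blockId)) {
        cursor.remove();               // O(1) removal, keeps cursor valid
      }
    }
  }

  private boolean tryProcess(long blockId) {
    return blockId % 2 == 0;           // stand-in for the real check
  }

  public static void main(String[] args) {
    PostponedScan scan = new PostponedScan();
    for (long id = 0; id < 10; id++) { scan.postpone(id); }
    scan.rescan(4);   // processes ids 0..3
    scan.rescan(4);   // resumes at id 4 instead of a random offset
  }
}
{code}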



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709767#comment-15709767
 ] 

Hadoop QA commented on HDFS-10913:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
43s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 70 unchanged - 1 fixed = 73 total (was 71) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 16s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}113m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10913 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841090/HDFS-10913.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux b74d6a632ce4 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / be5a757 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17713/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17713/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17713/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17713/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17713/console |
| Powered by | Apache Yetus 

[jira] [Updated] (HDFS-11184) Ozone: SCM: Make SCM use container protocol

2016-11-30 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11184:

Attachment: HDFS-11184-HDFS-7240.004.patch

Patch v4:
 * Added one more test case to make sure that end-to-end use cases work with 
the new SCM.
 * Changed the timeout on MiniOzoneCluster since all tests pass locally.

> Ozone: SCM: Make SCM use container protocol
> ---
>
> Key: HDFS-11184
> URL: https://issues.apache.org/jira/browse/HDFS-11184
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-11184-HDFS-7240.001.patch, 
> HDFS-11184-HDFS-7240.002.patch, HDFS-11184-HDFS-7240.003.patch, 
> HDFS-11184-HDFS-7240.004.patch
>
>
> SCM will start using the container protocol to communicate with datanodes. 
> This change introduces some test failures due to missing features that will 
> be moved to KSM. Will file a separate JIRA to track the disabled ozone tests. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8674) Improve performance of postponed block scans

2016-11-30 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709744#comment-15709744
 ] 

Kihwal Lee commented on HDFS-8674:
--

Looks like {{HDFS-8674.2.patch}} is for branch-2.  Can you post a trunk version?

> Improve performance of postponed block scans
> 
>
> Key: HDFS-8674
> URL: https://issues.apache.org/jira/browse/HDFS-8674
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-8674.2.patch, HDFS-8674.patch, HDFS-8674.patch
>
>
> When a standby goes active, it marks all nodes as "stale" which will cause 
> block invalidations for over-replicated blocks to be queued until full block 
> reports are received from the nodes with the block.  The replication monitor 
> scans the queue with O(N) runtime.  It picks a random offset and iterates 
> through the set to randomize blocks scanned.
> The result is devastating when a cluster loses multiple nodes during a 
> rolling upgrade. Re-replication occurs, the nodes come back, the excess block 
> invalidations are postponed. Rescanning just 2k blocks out of millions of 
> postponed blocks may take multiple seconds. During the scan, the write lock 
> is held which stalls all other processing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8674) Improve performance of postponed block scans

2016-11-30 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-8674:
-
Target Version/s: 2.8.0, 3.0.0-alpha2  (was: 2.8.0, 2.7.4, 3.0.0-alpha2)

> Improve performance of postponed block scans
> 
>
> Key: HDFS-8674
> URL: https://issues.apache.org/jira/browse/HDFS-8674
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-8674.2.patch, HDFS-8674.patch, HDFS-8674.patch
>
>
> When a standby goes active, it marks all nodes as "stale" which will cause 
> block invalidations for over-replicated blocks to be queued until full block 
> reports are received from the nodes with the block.  The replication monitor 
> scans the queue with O(N) runtime.  It picks a random offset and iterates 
> through the set to randomize blocks scanned.
> The result is devastating when a cluster loses multiple nodes during a 
> rolling upgrade. Re-replication occurs, the nodes come back, the excess block 
> invalidations are postponed. Rescanning just 2k blocks out of millions of 
> postponed blocks may take multiple seconds. During the scan, the write lock 
> is held which stalls all other processing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11180) Intermittent deadlock in NameNode when failover happens.

2016-11-30 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709732#comment-15709732
 ] 

Kihwal Lee commented on HDFS-11180:
---

+1 for the patch.

I am not sure whether it is worth mentioning the correctness of the value in 
the release note. For most metrics uses, if not all, the values will be 
practically no more "inaccurate" than before.

> Intermittent deadlock in NameNode when failover happens.
> 
>
> Key: HDFS-11180
> URL: https://issues.apache.org/jira/browse/HDFS-11180
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Abhishek Modi
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: high-availability
> Attachments: HDFS-11180.00.patch, HDFS-11180.01.patch, 
> HDFS-11180.02.patch, HDFS-11180.03.patch, HDFS-11180.04.patch, jstack.log
>
>
> It happens when metrics are updated at the same time a failover is 
> happening. Please find the attached jstack from that point in time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10453) ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster.

2016-11-30 Thread Taklon Stephen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709564#comment-15709564
 ] 

Taklon Stephen Wu commented on HDFS-10453:
--

Is this issue still ongoing?

> ReplicationMonitor thread could stuck for long time due to the race between 
> replication and delete of same file in a large cluster.
> ---
>
> Key: HDFS-10453
> URL: https://issues.apache.org/jira/browse/HDFS-10453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1, 2.5.2, 2.7.1, 2.6.4
>Reporter: He Xiaoqiao
> Attachments: HDFS-10453-branch-2.001.patch, 
> HDFS-10453-branch-2.003.patch, HDFS-10453.001.patch
>
>
> The ReplicationMonitor thread could get stuck for a long time and, with small 
> probability, lose data. Consider the typical scenario:
> (1) create and close a file with the default replication (3);
> (2) increase the replication of the file (to 10);
> (3) delete the file while the ReplicationMonitor is scheduling blocks belonging 
> to that file for replication.
> If the ReplicationMonitor stall reappears, the NameNode will print logs such as:
> {code:xml}
> 2016-04-19 10:20:48,083 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> ..
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) For more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough 
> replicas: expected size is 7 but only 0 storage types can be selected 
> (replication=10, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK, 
> DISK, DISK, DISK, DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2016-04-19 10:21:17,184 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
> place enough replicas, still in need of 7 to reach 10 
> (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, 
> newBlock=false) All required storage types are unavailable:  
> unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, 
> storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> {code}
> This is because two threads (NameNodeRpcServer and ReplicationMonitor) 
> process the same block at the same moment:
> (1) ReplicationMonitor#computeReplicationWorkForBlocks gets blocks to 
> replicate and releases the global lock.
> (2) FSNamesystem#delete is invoked to delete the blocks and clears their 
> references in the blocks map, neededReplications, etc.; the block's 
> numBytes is set to NO_ACK (Long.MAX_VALUE), which indicates that the block 
> deletion does not need an explicit ACK from the node.
> (3) ReplicationMonitor#computeReplicationWorkForBlocks goes on to choose 
> targets for the same blocks, and no node is selected even after traversing 
> the whole cluster, because no node can satisfy the goodness criteria 
> (remaining space must reach the required size, Long.MAX_VALUE).
> During stage (3), ReplicationMonitor is stuck for a long time, especially 
> in a large cluster; invalidateBlocks and neededReplications keep growing 
> with nothing consuming them, and in the worst case data is lost.
> This can mostly be avoided by skipping chooseTarget for blocks marked 
> BlockCommand.NO_ACK and removing them from neededReplications, as the 
> sketch below illustrates.
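
A minimal sketch of that skip, assuming the shape of the branch-2 replication 
loop (the surrounding loop and the {{priority}} variable are illustrative; 
the committed patch may differ):

{code:java}
// Hedged sketch, not the committed patch: inside
// ReplicationMonitor#computeReplicationWorkForBlocks, skip any block that
// FSNamesystem#delete has already marked with numBytes == NO_ACK
// (Long.MAX_VALUE); no datanode can ever satisfy that required size.
if (block.getNumBytes() == BlockCommand.NO_ACK) {
  // Drop it from the queue instead of calling chooseTarget(), which
  // would fruitlessly traverse the whole cluster.
  neededReplications.remove(block, priority);
  continue;
}
{code}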



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709429#comment-15709429
 ] 

Xiaobing Zhou commented on HDFS-10913:
--

Posted the v005 patch to address your comments. Thanks [~arpitagarwal] for the reviews.

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. To better test the 
> behavior of slow mirrors, we need to introduce fault injectors.
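
The fault-injector hook itself can be very small. A minimal sketch of the 
pattern, modeled on Hadoop's existing injectors (the class and method names 
below are illustrative, not necessarily those in the patch):

{code:java}
import java.io.IOException;

// Illustrative fault injector for simulating slow mirrors: production
// code keeps the no-op singleton, tests swap in a delaying subclass.
public class BlockReceiverFaultInjectorSketch {
  private static BlockReceiverFaultInjectorSketch instance =
      new BlockReceiverFaultInjectorSketch();

  public static BlockReceiverFaultInjectorSketch get() {
    return instance;
  }

  // Tests install a subclass to change behavior at the hook points.
  public static void set(BlockReceiverFaultInjectorSketch injector) {
    instance = injector;
  }

  // No-op in production; BlockReceiver would call this before forwarding
  // a packet to the downstream mirror, so a test subclass can sleep here
  // long enough to cross datanodeSlowLogThresholdMs.
  public void delayBeforeForwardingPacket() throws IOException {}
}
{code}

A test would then call set() with a subclass whose delay method sleeps, drive 
a write pipeline, and assert that the slow-mirror warning is logged.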



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10913:
-
Attachment: HDFS-10913.005.patch

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. To better test the 
> behavior of slow mirrors, we need to introduce fault injectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10913:
-
Attachment: (was: HDFS-10913.005.patch)

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. To better test the 
> behavior of slow mirrors, we need to introduce fault injectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709245#comment-15709245
 ] 

Hadoop QA commented on HDFS-10913:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HDFS-10913 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10913 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841087/HDFS-10913.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17712/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. To better test the 
> behavior of slow mirrors, we need to introduce fault injectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10913) Introduce fault injectors to simulate slow mirrors

2016-11-30 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10913:
-
Attachment: HDFS-10913.005.patch

> Introduce fault injectors to simulate slow mirrors
> --
>
> Key: HDFS-10913
> URL: https://issues.apache.org/jira/browse/HDFS-10913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-10913.000.patch, HDFS-10913.001.patch, 
> HDFS-10913.002.patch, HDFS-10913.003.patch, HDFS-10913.004.patch, 
> HDFS-10913.005.patch
>
>
> BlockReceiver#datanodeSlowLogThresholdMs is used as the threshold to detect 
> slow mirrors, but BlockReceiver only writes warning logs. To better test the 
> behavior of slow mirrors, we need to introduce fault injectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8678) Bring back the feature to view chunks of files in the HDFS file browser

2016-11-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709226#comment-15709226
 ] 

Hudson commented on HDFS-8678:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10914 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10914/])
HDFS-8678. Bring back the feature to view chunks of files in the HDFS 
(raviprak: rev 625df87c7b8ec2787e743d845fadde5e73479dc1)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js


> Bring back the feature to view chunks of files in the HDFS file browser
> ---
>
> Key: HDFS-8678
> URL: https://issues.apache.org/jira/browse/HDFS-8678
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ui
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 2.9.0
>
> Attachments: HDFS-8678.01.patch, HDFS-8678.02.patch, 
> HDFS-8678.03.patch
>
>
> The legacy file browser displayed small chunks of a file in the browser 
> itself. This was useful because users could quickly verify that their input 
> or output is in the expected format. We should bring back this 
> functionality.
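
Under the hood, previewing a chunk only requires a ranged WebHDFS OPEN call, 
which already accepts offset and length parameters. A minimal sketch of such 
a fetch from Java (host, port, and path are placeholders, not values from 
this JIRA):

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Fetch the first 1024 bytes of a file via WebHDFS, the same ranged
// read the explorer UI can use for its preview.
public class WebHdfsChunkPreview {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://namenode:50070/webhdfs/v1/tmp/test.log"
        + "?op=OPEN&offset=0&length=1024");
    // OPEN redirects to a datanode; HttpURLConnection follows it.
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
{code}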



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7588) Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files

2016-11-30 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709194#comment-15709194
 ] 

Ravi Prakash commented on HDFS-7588:


Thanks for all the reviews, Allen, Haohui, and others. All the features in 
this JIRA are now in branch-2. This concludes this JIRA.

> Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs 
> and uploading files
> ---
>
> Key: HDFS-7588
> URL: https://issues.apache.org/jira/browse/HDFS-7588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui, webhdfs
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 2.9.0
>
> Attachments: HDFS-7588.01.patch, HDFS-7588.02.patch
>
>
> The new HTML5 web browser is neat; however, it lacks a few features that 
> would make it more useful (a sketch of one such call follows this list):
> 1. chown
> 2. chmod
> 3. Uploading files
> 4. mkdir
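
Each of these features maps onto an existing WebHDFS operation (SETOWNER, 
SETPERMISSION, CREATE, MKDIRS), so the UI mostly needs to issue the right 
HTTP calls. A hedged sketch of the mkdir case (host, port, and path are 
placeholders, not values from this JIRA):

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;

// Create a directory through WebHDFS, as an explorer mkdir button could.
public class WebHdfsMkdir {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://namenode:50070/webhdfs/v1/user/test/newdir"
        + "?op=MKDIRS&permission=755");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT"); // MKDIRS is a PUT operation
    // WebHDFS returns {"boolean": true} in the body on success.
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}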



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7588) Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files

2016-11-30 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-7588:
---
Fix Version/s: 2.9.0

> Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs 
> and uploading files
> ---
>
> Key: HDFS-7588
> URL: https://issues.apache.org/jira/browse/HDFS-7588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui, webhdfs
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 2.9.0
>
> Attachments: HDFS-7588.01.patch, HDFS-7588.02.patch
>
>
> The new HTML5 web browser is neat; however, it lacks a few features that 
> would make it more useful:
> 1. chown
> 2. chmod
> 3. Uploading files
> 4. mkdir



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Closed] (HDFS-7588) Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files

2016-11-30 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed HDFS-7588.
--

> Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs 
> and uploading files
> ---
>
> Key: HDFS-7588
> URL: https://issues.apache.org/jira/browse/HDFS-7588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ui, webhdfs
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 2.9.0
>
> Attachments: HDFS-7588.01.patch, HDFS-7588.02.patch
>
>
> The new HTML5 web browser is neat; however, it lacks a few features that 
> would make it more useful:
> 1. chown
> 2. chmod
> 3. Uploading files
> 4. mkdir



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8678) Bring back the feature to view chunks of files in the HDFS file browser

2016-11-30 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-8678:
---
   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

> Bring back the feature to view chunks of files in the HDFS file browser
> ---
>
> Key: HDFS-8678
> URL: https://issues.apache.org/jira/browse/HDFS-8678
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ui
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 2.9.0
>
> Attachments: HDFS-8678.01.patch, HDFS-8678.02.patch, 
> HDFS-8678.03.patch
>
>
> The legacy file browser displayed small chunks of a file in the browser 
> itself. This was useful because users could quickly verify that their input 
> or output is in the expected format. We should bring back this 
> functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


