[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201762#comment-14201762
 ] 

Hadoop QA commented on HDFS-7279:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12679588/HDFS-7279.007.patch
  against trunk revision 61effcb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.TestFsck
org.apache.hadoop.hdfs.server.namenode.TestDeleteRace

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8687//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8687//console

This message is automatically generated.

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version the DN currently uses (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.
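 For illustration only, here is a minimal sketch of the kind of netty 4 HTTP bootstrap this 
 direction implies. The class name, port, and handler body below are assumptions made for the 
 sketch, not the contents of the attached patches:
 {code}
 // Illustrative sketch only: a bare netty 4 HTTP server, not the actual patch.
 import io.netty.bootstrap.ServerBootstrap;
 import io.netty.channel.*;
 import io.netty.channel.nio.NioEventLoopGroup;
 import io.netty.channel.socket.SocketChannel;
 import io.netty.channel.socket.nio.NioServerSocketChannel;
 import io.netty.handler.codec.http.HttpRequest;
 import io.netty.handler.codec.http.HttpServerCodec;

 public class WebHdfsNettySketch {
   public static void main(String[] args) throws Exception {
     EventLoopGroup boss = new NioEventLoopGroup(1);
     EventLoopGroup workers = new NioEventLoopGroup();
     try {
       ServerBootstrap b = new ServerBootstrap()
           .group(boss, workers)
           .channel(NioServerSocketChannel.class)
           .childHandler(new ChannelInitializer<SocketChannel>() {
             @Override
             protected void initChannel(SocketChannel ch) {
               // HTTP codec first, then a handler that would dispatch to the webhdfs logic.
               ch.pipeline().addLast(new HttpServerCodec());
               ch.pipeline().addLast(new SimpleChannelInboundHandler<HttpRequest>() {
                 @Override
                 protected void channelRead0(ChannelHandlerContext ctx, HttpRequest req) {
                   // serve or reject the webhdfs request here
                 }
               });
             }
           });
       // 50075 is just an example port, not a claim about the patch.
       b.bind(50075).sync().channel().closeFuture().sync();
     } finally {
       boss.shutdownGracefully();
       workers.shutdownGracefully();
     }
   }
 }
 {code}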



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201763#comment-14201763
 ] 

Hadoop QA commented on HDFS-7314:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680087/HDFS-7314-4.patch
  against trunk revision ba0a42c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.TestFsck
org.apache.hadoop.hdfs.server.namenode.TestDeleteRace

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8686//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8686//console

This message is automatically generated.

 Aborted DFSClient's impact on long running service like YARN
 

 Key: HDFS-7314
 URL: https://issues.apache.org/jira/browse/HDFS-7314
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
 HDFS-7314.patch


 It happened in a YARN nodemanager scenario, but it could happen to any 
 long-running service that uses a cached instance of DistributedFileSystem.
 1. The active NN is under heavy load, so it became unavailable for 10 minutes; 
 any DFSClient request will get a ConnectTimeoutException.
 2. The YARN nodemanager uses DFSClient for certain write operations such as the log 
 aggregator or the shared cache in YARN-1492. The DFSClient used by the YARN NM's 
 renewLease RPC got a ConnectTimeoutException.
 {noformat}
 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
 renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
 Aborting ...
 {noformat}
 3. After the DFSClient is in the Aborted state, the YARN NM can't use that cached 
 instance of DistributedFileSystem.
 {noformat}
 2014-10-29 20:26:23,991 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Failed to download rsrc...
 java.io.IOException: Filesystem closed
 at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
 at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
 Given the call stack is YARN - DistributedFileSystem - DFSClient, this can 
 be addressed at different layers (a sketch of the first option follows the list below).
 * YARN closes the DistributedFileSystem object when it receives some well-defined 
 exception; the next HDFS call will then create a new instance of 
 DistributedFileSystem. We would have to fix all such places in YARN, and other HDFS 
 applications would need to address this as well.
 * DistributedFileSystem detects the aborted DFSClient and creates a new instance 
 of DFSClient. We will need to fix all the places 
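 For illustration, a minimal sketch of the first option (YARN-side recovery). The helpers 
 getCachedFs and replaceCachedFs are hypothetical, not existing YARN code:
 {code}
 // Hypothetical sketch of the YARN-side option; not an actual patch.
 FileSystem fs = getCachedFs();                    // hypothetical cache accessor
 try {
   return fs.getFileStatus(path);
 } catch (IOException e) {
   if ("Filesystem closed".equals(e.getMessage())) {
     // the cached DFSClient was aborted: drop it and retry with a fresh instance
     FileSystem fresh = FileSystem.newInstance(path.toUri(), conf);
     replaceCachedFs(fresh);                       // hypothetical cache update
     return fresh.getFileStatus(path);
   }
   throw e;
 }
 {code}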

[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201801#comment-14201801
 ] 

Hadoop QA commented on HDFS-7310:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680093/HDFS-7310-003.patch
  against trunk revision 61effcb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8688//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8688//console

This message is automatically generated.

 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch


 Currently the Mover logic may move blocks to any DN which has the target storage 
 type. But if the src DN has the target storage type, then the Mover can give highest 
 priority to the local DN. If the local DN does not contain the target storage type, then 
 it can assign to any DN as the current logic does (see the sketch below).
   This is just a thought; I have not gone through the code fully yet.
 Thoughts?
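 A rough sketch of the proposed preference. The helper names hasSpace/chooseRemoteTarget are 
 hypothetical and this is not the actual Mover code:
 {code}
 // Hypothetical sketch: prefer a storage of the wanted type on the source DN itself.
 DatanodeStorageInfo chooseTarget(Block block, DatanodeDescriptor source,
                                  StorageType wanted) {
   for (DatanodeStorageInfo storage : source.getStorageInfos()) {
     if (storage.getStorageType() == wanted
         && storage.getRemaining() >= block.getNumBytes()) {
       return storage;                        // local move, no cross-node transfer
     }
   }
   // otherwise fall back to the existing "any DN with the storage type" logic
   return chooseRemoteTarget(block, wanted);  // hypothetical existing fallback
 }
 {code}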



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7376) Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7

2014-11-07 Thread Johannes Zillmann (JIRA)
Johannes Zillmann created HDFS-7376:
---

 Summary: Upgrade jsch lib to jsch-0.1.51 to avoid problems running 
on java7
 Key: HDFS-7376
 URL: https://issues.apache.org/jira/browse/HDFS-7376
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Johannes Zillmann


We had an application sitting on top of Hadoop and got problems using jsch once 
we switched to java 7. Got this exception:
{noformat}
 com.jcraft.jsch.JSchException: verify: false
at com.jcraft.jsch.Session.connect(Session.java:330)
at com.jcraft.jsch.Session.connect(Session.java:183)
{noformat}

Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
got in conflict with hadoop's jsch version (we fixed this for us by jarjar'ing 
our jsch version).

So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
check whether the ssh part works properly on java7, or preventively upgrade the 
jsch lib to jsch-0.1.51!

Some references to problems reported:
- 
http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
-https://issues.apache.org/bugzilla/show_bug.cgi?id=53437



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7376) Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7

2014-11-07 Thread Johannes Zillmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Zillmann updated HDFS-7376:

Description: 
We had an application sitting on top of Hadoop and got problems using jsch once 
we switched to java 7. Got this exception:
{noformat}
 com.jcraft.jsch.JSchException: verify: false
at com.jcraft.jsch.Session.connect(Session.java:330)
at com.jcraft.jsch.Session.connect(Session.java:183)
{noformat}

Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
got in conflict with hadoop's jsch version (we fixed this for us by jarjar'ing 
our jsch version).

So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
check whether the ssh part works properly on java7, or preventively upgrade the 
jsch lib to jsch-0.1.51!

Some references to problems reported:
- 
http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
- https://issues.apache.org/bugzilla/show_bug.cgi?id=53437

  was:
We had an application sitting on top of Hadoop and got problems using jsch once 
we switched to java 7. Got this exception:
{noformat}
 com.jcraft.jsch.JSchException: verify: false
at com.jcraft.jsch.Session.connect(Session.java:330)
at com.jcraft.jsch.Session.connect(Session.java:183)
{noformat}

Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
got in conflict with hadoop's jsch version (we fixed this for us by jarjar'ing 
our jsch version).

So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
check whether the ssh part works properly on java7, or preventively upgrade the 
jsch lib to jsch-0.1.51!

Some references to problems reported:
- 
http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
-https://issues.apache.org/bugzilla/show_bug.cgi?id=53437


 Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7
 --

 Key: HDFS-7376
 URL: https://issues.apache.org/jira/browse/HDFS-7376
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Johannes Zillmann

 We had an application sitting on top of Hadoop and got problems using jsch 
 once we switched to java 7. Got this exception:
 {noformat}
  com.jcraft.jsch.JSchException: verify: false
   at com.jcraft.jsch.Session.connect(Session.java:330)
   at com.jcraft.jsch.Session.connect(Session.java:183)
 {noformat}
 Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
 got in conflict with hadoop's jsch version (we fixed this for us by 
 jarjar'ing our jsch version).
 So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
 check whether the ssh part works properly on java7, or preventively upgrade 
 the jsch lib to jsch-0.1.51!
 Some references to problems reported:
 - 
 http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
 - https://issues.apache.org/bugzilla/show_bug.cgi?id=53437



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7377) [ DataNode Web UI ] Metric page will not display anything..

2014-11-07 Thread Brahma Reddy Battula (JIRA)
Brahma Reddy Battula created HDFS-7377:
--

 Summary: [ DataNode Web UI ] Metric page will not display 
anything..
 Key: HDFS-7377
 URL: https://issues.apache.org/jira/browse/HDFS-7377
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Priority: Critical


Scenario:
==

Go to http://DN_IP:http port/dataNodeHome.jsp
and click on the metrics link..
we will not be able to see anything..

Did not find the reason.. do we need to implement the metric page..?

Checked HDFS-2933, but did not find any clue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7376) Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7

2014-11-07 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-7376:
-
 Component/s: build
Target Version/s: 2.7.0

 Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7
 --

 Key: HDFS-7376
 URL: https://issues.apache.org/jira/browse/HDFS-7376
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Reporter: Johannes Zillmann

 We had an application sitting on top of Hadoop and got problems using jsch 
 once we switched to java 7. Got this exception:
 {noformat}
  com.jcraft.jsch.JSchException: verify: false
   at com.jcraft.jsch.Session.connect(Session.java:330)
   at com.jcraft.jsch.Session.connect(Session.java:183)
 {noformat}
 Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
 got in conflict with hadoop's jsch version (we fixed this for us by 
 jarjar'ing our jsch version).
 So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
 check whether the ssh part works properly on java7, or preventively upgrade 
 the jsch lib to jsch-0.1.51!
 Some references to problems reported:
 - 
 http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
 - https://issues.apache.org/bugzilla/show_bug.cgi?id=53437



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7376) Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7

2014-11-07 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201820#comment-14201820
 ] 

Steve Loughran commented on HDFS-7376:
--

not seen any reports of this -yet- but it still seems worth doing. It's too 
late to do it for 2.6; targeting 2.7, which is java7+ only

 Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7
 --

 Key: HDFS-7376
 URL: https://issues.apache.org/jira/browse/HDFS-7376
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Reporter: Johannes Zillmann

 We had an application sitting on top of Hadoop and got problems using jsch 
 once we switched to java 7. Got this exception:
 {noformat}
  com.jcraft.jsch.JSchException: verify: false
   at com.jcraft.jsch.Session.connect(Session.java:330)
   at com.jcraft.jsch.Session.connect(Session.java:183)
 {noformat}
 Upgrading to jsch-0.1.51 from jsch-0.1.49 fixed the issue for us, but then it 
 got in conflict with hadoop's jsch version (we fixed this for us by 
 jarjar'ing our jsch version).
 So I think jsch got introduced by namenode HA (HDFS-1623). So you should 
 check whether the ssh part works properly on java7, or preventively upgrade 
 the jsch lib to jsch-0.1.51!
 Some references to problems reported:
 - 
 http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/
 - https://issues.apache.org/bugzilla/show_bug.cgi?id=53437



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201958#comment-14201958
 ] 

Hudson commented on HDFS-7221:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #736 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/736/])
HDFS-7221. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
e7f1c0482e5dff8a1549ace1fc2b366941170c58)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencingWithReplication fails consistently
 ---

 Key: HDFS-7221
 URL: https://issues.apache.org/jira/browse/HDFS-7221
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Fix For: 2.6.0

 Attachments: HDFS-7221.001.patch, HDFS-7221.002.patch, 
 HDFS-7221.003.patch, HDFS-7221.004.patch, HDFS-7221.005.patch


 TestDNFencingWithReplication consistently fails with a timeout, both in 
 jenkins runs and on my local machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7364) Balancer always shows zero Bytes Already Moved

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201956#comment-14201956
 ] 

Hudson commented on HDFS-7364:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #736 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/736/])
HDFS-7364. Balancer always shows zero Bytes Already Moved. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev ae71a671a3b4b454aa393c2974b6f1f16dd61405)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java


 Balancer always shows zero Bytes Already Moved
 --

 Key: HDFS-7364
 URL: https://issues.apache.org/jira/browse/HDFS-7364
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h7364_20141105.patch, h7364_20141106.patch


 Here is an example:
 {noformat}
 Time Stamp              Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
 Nov 5, 2014 5:23:38 PM  0           0 B                  116.82 MB           181.07 MB
 Nov 5, 2014 5:24:30 PM  1           0 B                  88.05 MB            181.07 MB
 Nov 5, 2014 5:25:10 PM  2           0 B                  73.08 MB            181.07 MB
 Nov 5, 2014 5:25:49 PM  3           0 B                  13.37 MB            90.53 MB
 Nov 5, 2014 5:26:30 PM  4           0 B                  13.59 MB            90.53 MB
 Nov 5, 2014 5:27:12 PM  5           0 B                  9.25 MB             90.53 MB
 The cluster is balanced. Exiting...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7365) Remove hdfs.server.blockmanagement.MutableBlockCollection

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201955#comment-14201955
 ] 

Hudson commented on HDFS-7365:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #736 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/736/])
HDFS-7365. Remove hdfs.server.blockmanagement.MutableBlockCollection. 
Contributed by Li Lu. (wheat9: rev 75b820cca9d4e709b9e8d40635ff0406528ad4ba)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/MutableBlockCollection.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Remove hdfs.server.blockmanagement.MutableBlockCollection
 -

 Key: HDFS-7365
 URL: https://issues.apache.org/jira/browse/HDFS-7365
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Li Lu
Assignee: Li Lu
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7365-110514.patch


 Seems like this component is no longer referenced. Is it OK to fully remove 
 it? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7226) TestDNFencing.testQueueingWithAppend failed often in latest test

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201949#comment-14201949
 ] 

Hudson commented on HDFS-7226:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #736 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/736/])
HDFS-7226. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
d026f3676278e24d7032dced5f14b52dec70b987)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencing.testQueueingWithAppend failed often in latest test
 

 Key: HDFS-7226
 URL: https://issues.apache.org/jira/browse/HDFS-7226
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
 Fix For: 2.6.0

 Attachments: HDFS-7226.001.patch, HDFS-7226.002.patch, 
 HDFS-7226.003.patch


 Using tool from HADOOP-11045, got the following report:
 {code}
 [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j 
 PreCommit-HDFS-Build -n 1 
 Recently FAILED builds in url: 
 https://builds.apache.org//job/PreCommit-HDFS-Build
 THERE ARE 9 builds (out of 9) that have failed tests in the past 1 days, 
 as listed below:
 ..
 Among 9 runs examined, all failed tests #failedRuns: testName:
 7: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 6: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
 3: 
 org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testFailedOpen
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testSyncBatching
 ..
 {code}
 TestDNFencingWithReplication.testFencingStress was reported as HDFS-7221. 
 Creating this jira for TestDNFencing.testQueueingWithAppend.
 Symptom:
 {code}
 Failed
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 Failing for the past 1 build (Since Failed#8390 )
 Took 2.9 sec.
 Error Message
 expected:18 but was:12
 Stacktrace
 java.lang.AssertionError: expected:18 but was:12
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend(TestDNFencing.java:448)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7378) There should be a method to quickly test if a file is a SequenceFile or if a stream contains SequenceFile data

2014-11-07 Thread Jens Rabe (JIRA)
Jens Rabe created HDFS-7378:
---

 Summary: There should be a method to quickly test if a file is a 
SequenceFile or if a stream contains SequenceFile data
 Key: HDFS-7378
 URL: https://issues.apache.org/jira/browse/HDFS-7378
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Jens Rabe
Priority: Trivial


Currently, to check whether a file is a SequenceFile or a stream contains data 
in SequenceFile format, one either has to check the message of the thrown 
exception when opening a file with SequenceFile.Reader, or has to check the 
first four bytes by him/herself.

A utility method like SequenceFile.isSequenceFile would be very handy here.
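A minimal sketch of what such a check could look like, using only the public FileSystem API 
(SequenceFile.isSequenceFile itself does not exist today; the method name below is just the 
proposal):
{code}
// Sketch only: peek at the first bytes; SequenceFiles start with the ASCII magic "SEQ".
public static boolean looksLikeSequenceFile(FileSystem fs, Path p) throws IOException {
  byte[] magic = new byte[3];
  FSDataInputStream in = fs.open(p);
  try {
    in.readFully(0, magic);
  } catch (EOFException e) {
    return false;                  // shorter than the header, cannot be a SequenceFile
  } finally {
    in.close();
  }
  return magic[0] == 'S' && magic[1] == 'E' && magic[2] == 'Q';
}
{code}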



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7377) [ DataNode Web UI ] Metric page will not display anything..

2014-11-07 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-7377:
--

Assignee: Brahma Reddy Battula

 [ DataNode Web UI ] Metric page will not display anything..
 ---

 Key: HDFS-7377
 URL: https://issues.apache.org/jira/browse/HDFS-7377
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical

 Scenario:
 ==
 Go to http://DN_IP:http port/dataNodeHome.jsp
 and click on the metrics link..
 we will not be able to see anything..
 Did not find the reason.. do we need to implement the metric page..?
 Checked HDFS-2933, but did not find any clue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7377) [ DataNode Web UI ] Metric page will not display anything..

2014-11-07 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202034#comment-14202034
 ] 

Brahma Reddy Battula commented on HDFS-7377:


As "format" will come as null when we access it from http://DN_IP:http 
port/dataNodeHome.jsp, the response will be null... Please check the following code 
for the same:

{code}
public void doGet(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {

  if (!HttpServer2.isInstrumentationAccessAllowed(getServletContext(),
      request, response)) {
    return;
  }

  String format = request.getParameter("format");
  Collection<MetricsContext> allContexts =
      ContextFactory.getFactory().getAllContexts();
  if ("json".equals(format)) {
    response.setContentType("application/json; charset=utf-8");
    PrintWriter out = response.getWriter();
    try {
      // Uses Jetty's built-in JSON support to convert the map into JSON.
      out.print(new JSON().toJSON(makeMap(allContexts)));
    } finally {
      out.close();
    }
  } else {
    PrintWriter out = response.getWriter();
    try {
      printMap(out, makeMap(allContexts));
    } finally {
      out.close();
    }
  }
}
{code}


Can we add something like below, since we are already handling the json format?
{code}
if (null == format) {
  format = FORMAT_JSON;
}
{code}


 [ DataNode Web UI ] Metric page will not display anything..
 ---

 Key: HDFS-7377
 URL: https://issues.apache.org/jira/browse/HDFS-7377
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Priority: Critical

 Scenario:
 ==
 Go to http://DN_IP:http port/dataNodeHome.jsp
 and click on the metrics link..
 we will not be able to see anything..
 Did not find the reason.. do we need to implement the metric page..?
 Checked HDFS-2933, but did not find any clue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7377) [ DataNode Web UI ] Metric page will not display anything..

2014-11-07 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202049#comment-14202049
 ] 

Brahma Reddy Battula commented on HDFS-7377:


The problem is that the contextMap has a null value... :)

{code}
Collection<MetricsContext> allContexts =
  ContextFactory.getFactory().getAllContexts();
{code}

 [ DataNode Web UI ] Metric page will not display anything..
 ---

 Key: HDFS-7377
 URL: https://issues.apache.org/jira/browse/HDFS-7377
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical

 Scenario:
 ==
 Go to http://DN_IP:http port/dataNodeHome.jsp
 and click on the metrics link..
 we will not be able to see anything..
 Did not find the reason.. do we need to implement the metric page..?
 Checked HDFS-2933, but did not find any clue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7331) Add Datanode network counts to datanode jmx page

2014-11-07 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7331:
---
Attachment: HDFS-7331.003.patch

[~wheat9], [~atm],

Thanks for the comments and suggestions. Attached is the .003 patch which 
removes the servlet, defaults the size of the bounded cache to Int.MAX_VALUE, 
and uses only the hostname/address (without the port) as the key to the cache.


 Add Datanode network counts to datanode jmx page
 

 Key: HDFS-7331
 URL: https://issues.apache.org/jira/browse/HDFS-7331
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, 
 HDFS-7331.003.patch


 Add per-datanode counts to the datanode jmx page. For example, networkErrors 
 could be exposed like this:
 {noformat}
   }, {
 ...
 "DatanodeNetworkCounts" : {"dn1":{"networkErrors":1}},
 ...
 "NamenodeAddresses" : 
 {"localhost":"BP-1103235125-127.0.0.1-1415057084497"},
 "VolumeInfo" : 
 {"/tmp/hadoop-cwl/dfs/data/current":{"freeSpace":3092725760,"usedSpace":28672,"reservedSpace":0}},
 "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
   }, {
 {noformat}
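 For illustration, one way such a nested value could be surfaced is a map-valued getter on the 
 DataNode's MXBean, which the /jmx page renders as nested JSON. The names and fields below are a 
 hypothetical sketch, not the attached patch:
 {code}
 // Hypothetical sketch: an MXBean-style getter backed by an in-memory map.
 private final ConcurrentMap<String, Map<String, Long>> datanodeNetworkCounts =
     new ConcurrentHashMap<String, Map<String, Long>>();

 public Map<String, Map<String, Long>> getDatanodeNetworkCounts() {
   return datanodeNetworkCounts;     // shows up on the jmx page as nested JSON
 }

 void incrementNetworkErrorsForHost(String host) {
   Map<String, Long> counts = datanodeNetworkCounts.get(host);
   if (counts == null) {
     counts = new ConcurrentHashMap<String, Long>();
     Map<String, Long> existing = datanodeNetworkCounts.putIfAbsent(host, counts);
     if (existing != null) {
       counts = existing;
     }
   }
   Long old = counts.get("networkErrors");
   counts.put("networkErrors", old == null ? 1L : old + 1L);  // not strictly atomic; sketch only
 }
 {code}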



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7226) TestDNFencing.testQueueingWithAppend failed often in latest test

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202072#comment-14202072
 ] 

Hudson commented on HDFS-7226:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1926/])
HDFS-7226. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
d026f3676278e24d7032dced5f14b52dec70b987)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencing.testQueueingWithAppend failed often in latest test
 

 Key: HDFS-7226
 URL: https://issues.apache.org/jira/browse/HDFS-7226
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
 Fix For: 2.6.0

 Attachments: HDFS-7226.001.patch, HDFS-7226.002.patch, 
 HDFS-7226.003.patch


 Using tool from HADOOP-11045, got the following report:
 {code}
 [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j 
 PreCommit-HDFS-Build -n 1 
 Recently FAILED builds in url: 
 https://builds.apache.org//job/PreCommit-HDFS-Build
 THERE ARE 9 builds (out of 9) that have failed tests in the past 1 days, 
 as listed below:
 ..
 Among 9 runs examined, all failed tests #failedRuns: testName:
 7: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 6: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
 3: 
 org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testFailedOpen
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testSyncBatching
 ..
 {code}
 TestDNFencingWithReplication.testFencingStress was reported as HDFS-7221. 
 Creating this jira for TestDNFencing.testQueueingWithAppend.
 Symptom:
 {code}
 Failed
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 Failing for the past 1 build (Since Failed#8390 )
 Took 2.9 sec.
 Error Message
 expected:18 but was:12
 Stacktrace
 java.lang.AssertionError: expected:18 but was:12
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend(TestDNFencing.java:448)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202081#comment-14202081
 ] 

Hudson commented on HDFS-7221:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1926/])
HDFS-7221. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
e7f1c0482e5dff8a1549ace1fc2b366941170c58)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencingWithReplication fails consistently
 ---

 Key: HDFS-7221
 URL: https://issues.apache.org/jira/browse/HDFS-7221
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Fix For: 2.6.0

 Attachments: HDFS-7221.001.patch, HDFS-7221.002.patch, 
 HDFS-7221.003.patch, HDFS-7221.004.patch, HDFS-7221.005.patch


 TestDNFencingWithReplication consistently fails with a timeout, both in 
 jenkins runs and on my local machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7365) Remove hdfs.server.blockmanagement.MutableBlockCollection

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202078#comment-14202078
 ] 

Hudson commented on HDFS-7365:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1926/])
HDFS-7365. Remove hdfs.server.blockmanagement.MutableBlockCollection. 
Contributed by Li Lu. (wheat9: rev 75b820cca9d4e709b9e8d40635ff0406528ad4ba)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/MutableBlockCollection.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Remove hdfs.server.blockmanagement.MutableBlockCollection
 -

 Key: HDFS-7365
 URL: https://issues.apache.org/jira/browse/HDFS-7365
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Li Lu
Assignee: Li Lu
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7365-110514.patch


 Seems like this component is no longer referenced. Is it OK to fully remove 
 it? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7364) Balancer always shows zero Bytes Already Moved

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202079#comment-14202079
 ] 

Hudson commented on HDFS-7364:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1926 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1926/])
HDFS-7364. Balancer always shows zero Bytes Already Moved. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev ae71a671a3b4b454aa393c2974b6f1f16dd61405)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java


 Balancer always shows zero Bytes Already Moved
 --

 Key: HDFS-7364
 URL: https://issues.apache.org/jira/browse/HDFS-7364
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h7364_20141105.patch, h7364_20141106.patch


 Here is an example:
 {noformat}
 Time Stamp              Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
 Nov 5, 2014 5:23:38 PM  0           0 B                  116.82 MB           181.07 MB
 Nov 5, 2014 5:24:30 PM  1           0 B                  88.05 MB            181.07 MB
 Nov 5, 2014 5:25:10 PM  2           0 B                  73.08 MB            181.07 MB
 Nov 5, 2014 5:25:49 PM  3           0 B                  13.37 MB            90.53 MB
 Nov 5, 2014 5:26:30 PM  4           0 B                  13.59 MB            90.53 MB
 Nov 5, 2014 5:27:12 PM  5           0 B                  9.25 MB             90.53 MB
 The cluster is balanced. Exiting...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7365) Remove hdfs.server.blockmanagement.MutableBlockCollection

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202152#comment-14202152
 ] 

Hudson commented on HDFS-7365:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1950 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1950/])
HDFS-7365. Remove hdfs.server.blockmanagement.MutableBlockCollection. 
Contributed by Li Lu. (wheat9: rev 75b820cca9d4e709b9e8d40635ff0406528ad4ba)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/MutableBlockCollection.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Remove hdfs.server.blockmanagement.MutableBlockCollection
 -

 Key: HDFS-7365
 URL: https://issues.apache.org/jira/browse/HDFS-7365
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Li Lu
Assignee: Li Lu
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7365-110514.patch


 Seems like this component is no longer referenced. Is it OK to fully remove 
 it? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7226) TestDNFencing.testQueueingWithAppend failed often in latest test

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202146#comment-14202146
 ] 

Hudson commented on HDFS-7226:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1950 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1950/])
HDFS-7226. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
d026f3676278e24d7032dced5f14b52dec70b987)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencing.testQueueingWithAppend failed often in latest test
 

 Key: HDFS-7226
 URL: https://issues.apache.org/jira/browse/HDFS-7226
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
 Fix For: 2.6.0

 Attachments: HDFS-7226.001.patch, HDFS-7226.002.patch, 
 HDFS-7226.003.patch


 Using tool from HADOOP-11045, got the following report:
 {code}
 [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j 
 PreCommit-HDFS-Build -n 1 
 Recently FAILED builds in url: 
 https://builds.apache.org//job/PreCommit-HDFS-Build
 THERE ARE 9 builds (out of 9) that have failed tests in the past 1 days, 
 as listed below:
 ..
 Among 9 runs examined, all failed tests #failedRuns: testName:
 7: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 6: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
 3: 
 org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testFailedOpen
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testSyncBatching
 ..
 {code}
 TestDNFencingWithReplication.testFencingStress was reported as HDFS-7221. 
 Creating this jira for TestDNFencing.testQueueingWithAppend.
 Symptom:
 {code}
 Failed
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 Failing for the past 1 build (Since Failed#8390 )
 Took 2.9 sec.
 Error Message
 expected:18 but was:12
 Stacktrace
 java.lang.AssertionError: expected:18 but was:12
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend(TestDNFencing.java:448)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7221) TestDNFencingWithReplication fails consistently

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202155#comment-14202155
 ] 

Hudson commented on HDFS-7221:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1950 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1950/])
HDFS-7221. Update CHANGES.txt to indicate fix in 2.6.0. (cnauroth: rev 
e7f1c0482e5dff8a1549ace1fc2b366941170c58)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestDNFencingWithReplication fails consistently
 ---

 Key: HDFS-7221
 URL: https://issues.apache.org/jira/browse/HDFS-7221
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Fix For: 2.6.0

 Attachments: HDFS-7221.001.patch, HDFS-7221.002.patch, 
 HDFS-7221.003.patch, HDFS-7221.004.patch, HDFS-7221.005.patch


 TestDNFencingWithReplication consistently fails with a timeout, both in 
 jenkins runs and on my local machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7364) Balancer always shows zero Bytes Already Moved

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202153#comment-14202153
 ] 

Hudson commented on HDFS-7364:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1950 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1950/])
HDFS-7364. Balancer always shows zero Bytes Already Moved. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev ae71a671a3b4b454aa393c2974b6f1f16dd61405)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java


 Balancer always shows zero Bytes Already Moved
 --

 Key: HDFS-7364
 URL: https://issues.apache.org/jira/browse/HDFS-7364
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h7364_20141105.patch, h7364_20141106.patch


 Here is an example:
 {noformat}
 Time Stamp              Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
 Nov 5, 2014 5:23:38 PM  0           0 B                  116.82 MB           181.07 MB
 Nov 5, 2014 5:24:30 PM  1           0 B                  88.05 MB            181.07 MB
 Nov 5, 2014 5:25:10 PM  2           0 B                  73.08 MB            181.07 MB
 Nov 5, 2014 5:25:49 PM  3           0 B                  13.37 MB            90.53 MB
 Nov 5, 2014 5:26:30 PM  4           0 B                  13.59 MB            90.53 MB
 Nov 5, 2014 5:27:12 PM  5           0 B                  9.25 MB             90.53 MB
 The cluster is balanced. Exiting...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7226) TestDNFencing.testQueueingWithAppend failed often in latest test

2014-11-07 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202189#comment-14202189
 ] 

Yongjun Zhang commented on HDFS-7226:
-

HI [~cnauroth], thanks a lot for taking care of the merge!


 TestDNFencing.testQueueingWithAppend failed often in latest test
 

 Key: HDFS-7226
 URL: https://issues.apache.org/jira/browse/HDFS-7226
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
 Fix For: 2.6.0

 Attachments: HDFS-7226.001.patch, HDFS-7226.002.patch, 
 HDFS-7226.003.patch


 Using tool from HADOOP-11045, got the following report:
 {code}
 [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j 
 PreCommit-HDFS-Build -n 1 
 Recently FAILED builds in url: 
 https://builds.apache.org//job/PreCommit-HDFS-Build
 THERE ARE 9 builds (out of 9) that have failed tests in the past 1 days, 
 as listed below:
 ..
 Among 9 runs examined, all failed tests #failedRuns: testName:
 7: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 6: 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress
 3: 
 org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testOpenFilesWithMultipleSnapshots
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testFailedOpen
 1: org.apache.hadoop.hdfs.server.namenode.TestEditLog.testSyncBatching
 ..
 {code}
 TestDNFencingWithReplication.testFencingStress was reported as HDFS-7221. 
 Creating this jira for TestDNFencing.testQueueingWithAppend.
 Symptom:
 {code}
 Failed
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend
 Failing for the past 1 build (Since Failed#8390 )
 Took 2.9 sec.
 Error Message
 expected:18 but was:12
 Stacktrace
 java.lang.AssertionError: expected:18 but was:12
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend(TestDNFencing.java:448)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202203#comment-14202203
 ] 

stack commented on HDFS-6803:
-

Am I in the right ballpark? Thanks (Need license to hack on dfsinputstream to 
make it more 'live' -- thanks).

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch, we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to this 
 issue a patch that weaves the assumptions into DFSIS as javadoc and class comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202299#comment-14202299
 ] 

Hadoop QA commented on HDFS-7331:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680170/HDFS-7331.003.patch
  against trunk revision 42bbe37.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1219 javac 
compiler warnings (more than the trunk's current 1218 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8689//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8689//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8689//console

This message is automatically generated.

 Add Datanode network counts to datanode jmx page
 

 Key: HDFS-7331
 URL: https://issues.apache.org/jira/browse/HDFS-7331
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, 
 HDFS-7331.003.patch


 Add per-datanode counts to the datanode jmx page. For example, networkErrors 
 could be exposed like this:
 {noformat}
   }, {
 ...
 "DatanodeNetworkCounts" : {"dn1":{"networkErrors":1}},
 ...
 "NamenodeAddresses" : 
 {"localhost":"BP-1103235125-127.0.0.1-1415057084497"},
 "VolumeInfo" : 
 {"/tmp/hadoop-cwl/dfs/data/current":{"freeSpace":3092725760,"usedSpace":28672,"reservedSpace":0}},
 "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
   }, {
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202348#comment-14202348
 ] 

Zhe Zhang commented on HDFS-7374:
-

[~mingma] Thanks much for clarifying the state machine. I agree my option #2 is 
cleaner and makes the decommissioning of dead nodes much faster. I'll go ahead 
with that approach now. 

bq. If the node stays in Dead, DECOMMISSION_INPROGRESS for too long, have the 
higher layer application remove the node from exclude file and thus abort the 
decommission process. This will transition the node to Dead, NORMAL.

The specific higher layer application in my case is Cloudera Manager and I 
think it's possible to add this logic. However I don't know how easy it is to 
change all similar management applications.

bq.  HDFS-6791 mentioned another way to address the original issue. When nodes 
become dead, mark them DECOMMISSIONED and fix the replication to handle this 
case. In other words, get rid of Dead, DECOMMISSION_INPROGRESS state.

Do you mean allowing a {{DECOMMISSIONED}} node to be the source of a replica 
transfer? It seems a little fragile to me; intuitively, it could surprise upper 
layer applications that a {{DECOMMISSIONED}} node is still actively 
transferring data. But I would like to hear the opinions from other people.

 Allow decommissioning of dead DataNodes
 ---

 Key: HDFS-7374
 URL: https://issues.apache.org/jira/browse/HDFS-7374
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang

 We have seen the use case of decommissioning DataNodes that are already dead 
 or unresponsive, and not expected to rejoin the cluster.
 The logic introduced by HDFS-6791 will mark those nodes as 
 {{DECOMMISSION_INPROGRESS}}, with a hope that they can come back and finish 
 the decommission work. If an upper layer application is monitoring the 
 decommissioning progress, it will hang forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7331) Add Datanode network counts to datanode jmx page

2014-11-07 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7331:
---
Attachment: HDFS-7331.004.patch

.004 gets rid of the compiler warning.

 Add Datanode network counts to datanode jmx page
 

 Key: HDFS-7331
 URL: https://issues.apache.org/jira/browse/HDFS-7331
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, 
 HDFS-7331.003.patch, HDFS-7331.004.patch


 Add per-datanode counts to the datanode jmx page. For example, networkErrors 
 could be exposed like this:
 {noformat}
   }, {
 ...
 "DatanodeNetworkCounts" : {"dn1":{"networkErrors":1}},
 ...
 "NamenodeAddresses" : 
 {"localhost":"BP-1103235125-127.0.0.1-1415057084497"},
 "VolumeInfo" : 
 {"/tmp/hadoop-cwl/dfs/data/current":{"freeSpace":3092725760,"usedSpace":28672,"reservedSpace":0}},
 "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
   }, {
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202371#comment-14202371
 ] 

Ming Ma commented on HDFS-7374:
---

Yeah, the idea was to use {{DECOMMISSIONED}} node as the source node only when 
there is no {{NORMAL}} node available. Agree it breaks the state definition. 

 Allow decommissioning of dead DataNodes
 ---

 Key: HDFS-7374
 URL: https://issues.apache.org/jira/browse/HDFS-7374
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang

 We have seen the use case of decommissioning DataNodes that are already dead 
 or unresponsive, and not expected to rejoin the cluster.
 The logic introduced by HDFS-6791 will mark those nodes as 
 {{DECOMMISSION_INPROGRESS}}, with a hope that they can come back and finish 
 the decommission work. If an upper layer application is monitoring the 
 decommissioning progress, it will hang forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202385#comment-14202385
 ] 

Haohui Mai commented on HDFS-7279:
--

The test failures are unrelated.

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version currently used by the DN (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows

2014-11-07 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-7379:


 Summary: Fix unit test TestBalancer#testBalancerWithRamDisk 
failure on Windows
 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


There is a copy-paste error in the test file creation. The test is supposed 
to create two test files named path1 and path2 on RAM_DISK, but the error 
caused path1 to be created twice, with the second creation overwriting 
(deleting) the first one on RAM_DISK. This caused verification failures on 
certain Windows test machines. The fix is to create the files with different names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows

2014-11-07 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7379:
-
Attachment: HDFS-7379.01.patch

 Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows
 -

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is 
 supposed to create two test files named path1 and path2 on RAM_DISK, but the 
 error caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused verification failures on 
 certain Windows test machines. The fix is to create the files with different names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows

2014-11-07 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7379:
-
Description: There is a copy-paste error in the test file creation. The 
test is supposed to create two test files named path1 and path2 on RAM_DISK, 
but the error caused path1 to be created twice, with the second creation 
overwriting (deleting) the first one on RAM_DISK. This caused a verification 
failure for path2, as it never gets created. The fix is to create the test 
files with the correct names.   (was: There is a copy-paste error in the test 
file creation. The test is supposed to create two test files named path1 and 
path2 on RAM_DISK, but the error caused path1 to be created twice, with the 
second creation overwriting (deleting) the first one on RAM_DISK. This caused 
verification failures on certain Windows test machines. The fix is to create 
the files with different names. )

 Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows
 -

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7379:
-
Summary: Fix unit test TestBalancer#testBalancerWithRamDisk failure  (was: 
Fix unit test TestBalancer#testBalancerWithRamDisk failure on Windows)

 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7379:
-
Affects Version/s: 2.6.0
   Status: Patch Available  (was: Open)

 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202549#comment-14202549
 ] 

Haohui Mai commented on HDFS-7379:
--

+1 pending jenkins


 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7358:
--
Attachment: (was: h7358_20141107.patch)

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7358:
--
Attachment: h7358_20141107.patch

 Is 'State' the right name for this inner class that carries 
 state-of-stream-close and stuff to run on close? ...  Do you need this class? 
 It can't just be a method to call on close?
 
DFSOutputStream is big and lacks organization.  It has 30+ fields in 
DFSOutputStream alone, not counting inner classes such as DataStreamer. I think 
it is better to group the fields describing the state of the stream 
together.  Since I am not going to move the other fields for the moment, let's 
keep closed as a field.  Here is a new patch.

h7358_20141107.patch
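
For illustration only, a minimal sketch of the kind of grouping discussed 
above; the class and field names here are hypothetical and are not taken from 
h7358_20141107.patch.
{code}
// Hypothetical sketch; not the actual DFSOutputStream change.
class StreamCloseState {
  private volatile boolean closed = false;

  boolean isClosed() {
    return closed;
  }

  void markClosed() {
    closed = true;
  }
}

class OutputStreamSketch {
  private final StreamCloseState state = new StreamCloseState();

  public void close() {
    if (state.isClosed()) {
      return;              // already closed, nothing further to do
    }
    state.markClosed();    // record the close before releasing resources
    // ... flush buffers, complete the file, end the lease, etc.
  }
}
{code}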


 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch, h7358_20141107.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7358:
--
Attachment: h7358_20141107.patch

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7358:
--
Attachment: (was: h7358_20141107.patch)

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7358:
--
Attachment: h7358_20141107.patch

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch, h7358_20141107.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202635#comment-14202635
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7358:
---

[~stack], have you changed dfs.bytes-per-checksum in your test?  What 
version of Hadoop is it?
{code}
2014-11-04 16:55:57,202 DEBUG [sync.0] util.ByteArrayManager: allocate(65565): 
count=60367, aboveThreshold, [131072: 9998/1, free=1], recycled? true
{code}
I wonder why it allocates a 65565-byte (> 64kB) array.  See also HDFS-7308.

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch, h7358_20141107.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202686#comment-14202686
 ] 

Hadoop QA commented on HDFS-7331:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680207/HDFS-7331.004.patch
  against trunk revision 1e97f2f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8690//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8690//console

This message is automatically generated.

 Add Datanode network counts to datanode jmx page
 

 Key: HDFS-7331
 URL: https://issues.apache.org/jira/browse/HDFS-7331
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
 Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, 
 HDFS-7331.003.patch, HDFS-7331.004.patch


 Add per-datanode counts to the datanode jmx page. For example, networkErrors 
 could be exposed like this:
 {noformat}
    }, {
  ...
  "DatanodeNetworkCounts" : "{\"dn1\":{\"networkErrors\":1}}",
  ...
  "NamenodeAddresses" : 
  "{\"localhost\":\"BP-1103235125-127.0.0.1-1415057084497\"}",
  "VolumeInfo" : 
  "{\"/tmp/hadoop-cwl/dfs/data/current\":{\"freeSpace\":3092725760,\"usedSpace\":28672,\"reservedSpace\":0}}",
  "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
    }, {
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202685#comment-14202685
 ] 

Hadoop QA commented on HDFS-7379:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680234/HDFS-7379.01.patch
  against trunk revision 2ac1be7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8691//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8691//console

This message is automatically generated.

 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202691#comment-14202691
 ] 

Haohui Mai commented on HDFS-7379:
--

The test failures are unrelated. I'll commit this shortly.

 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) Fix unit test TestBalancer#testBalancerWithRamDisk failure

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7379:
--
Component/s: (was: datanode)
 test
   Priority: Minor  (was: Major)

 Fix unit test TestBalancer#testBalancerWithRamDisk failure
 --

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Minor
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) TestBalancer#testBalancerWithRamDisk creates test files incorrectly

2014-11-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7379:
-
Summary: TestBalancer#testBalancerWithRamDisk creates test files 
incorrectly  (was: Fix unit test TestBalancer#testBalancerWithRamDisk failure)

 TestBalancer#testBalancerWithRamDisk creates test files incorrectly
 ---

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Minor
 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7379) TestBalancer#testBalancerWithRamDisk creates test files incorrectly

2014-11-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7379:
-
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk, branch-2 and branch-2.6. Thanks [~xyao] for 
the contribution.

 TestBalancer#testBalancerWithRamDisk creates test files incorrectly
 ---

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Minor
 Fix For: 2.6.0

 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7379) TestBalancer#testBalancerWithRamDisk creates test files incorrectly

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202744#comment-14202744
 ] 

Hudson commented on HDFS-7379:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6484 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6484/])
HDFS-7379. TestBalancer#testBalancerWithRamDisk creates test files incorrectly. 
Contributed by Xiaoyu Yao. (wheat9: rev 
57760c0663288a7611c3609891ef92f1abf4bb53)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestBalancer#testBalancerWithRamDisk creates test files incorrectly
 ---

 Key: HDFS-7379
 URL: https://issues.apache.org/jira/browse/HDFS-7379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Minor
 Fix For: 2.6.0

 Attachments: HDFS-7379.01.patch


 There is a copy-paste error in the test file creation. The test is supposed 
 to create two test files named path1 and path2 on RAM_DISK, but the error 
 caused path1 to be created twice, with the second creation overwriting 
 (deleting) the first one on RAM_DISK. This caused a verification failure for 
 path2, as it never gets created. The fix is to create the test files with the 
 correct names. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7380) unsteady and slow performance when writing to file with block size 2GB

2014-11-07 Thread Adam Fuchs (JIRA)
Adam Fuchs created HDFS-7380:


 Summary: unsteady and slow performance when writing to file with 
block size 2GB
 Key: HDFS-7380
 URL: https://issues.apache.org/jira/browse/HDFS-7380
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Adam Fuchs
 Attachments: BenchmarkWrites.java

Appending to a large file with block size > 2GB can lead to periods of really 
poor performance (4X slower than optimal). I found this issue when looking at 
Accumulo write performance in ACCUMULO-3303. I wrote a small test application to 
isolate the performance problem down to some basic API calls (to be attached). A 
description of the execution can be found here: 
https://issues.apache.org/jira/browse/ACCUMULO-3303?focusedCommentId=14202830page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14202830

The specific hadoop version was as follows:
{code}
[root@n1 ~]# hadoop version
Hadoop 2.4.0.2.1.2.0-402
Subversion g...@github.com:hortonworks/hadoop.git -r 
9e5db004df1a751e93aa89b42956c5325f3a4482
Compiled by jenkins on 2014-04-27T22:28Z
Compiled with protoc 2.5.0
From source with checksum 9e788148daa5dd7934eb468e57e037b5
This command was run using /usr/lib/hadoop/hadoop-common-2.4.0.2.1.2.0-402.jar
{code}
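
The attached BenchmarkWrites.java is not reproduced in this thread; the sketch 
below only illustrates, under assumed parameters (block size, total file size, 
write buffer size), the kind of timed write loop such a benchmark performs.
{code}
// Hypothetical sketch of a timed HDFS write loop; not the attached BenchmarkWrites.java.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteThroughputSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    long blockSize = 4L * 1024 * 1024 * 1024;     // assumed block size > 2GB
    byte[] buffer = new byte[64 * 1024];          // assumed 64KB writes
    long totalBytes = 16L * 1024 * 1024 * 1024;   // assumed 16GB of data
    FSDataOutputStream out = fs.create(
        new Path("/tmp/bench-write"), true, 4096, (short) 3, blockSize);
    try {
      long written = 0;
      long start = System.nanoTime();
      while (written < totalBytes) {
        out.write(buffer);
        written += buffer.length;
        if (written % (1024L * 1024 * 1024) == 0) {   // report once per GB
          double secs = (System.nanoTime() - start) / 1e9;
          System.out.printf("%d GB written, %.1f MB/s%n",
              written >> 30, (written / 1048576.0) / secs);
        }
      }
    } finally {
      out.close();
    }
  }
}
{code}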



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7380) unsteady and slow performance when writing to file with block size 2GB

2014-11-07 Thread Adam Fuchs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Fuchs updated HDFS-7380:
-
Attachment: BenchmarkWrites.java

 unsteady and slow performance when writing to file with block size 2GB
 ---

 Key: HDFS-7380
 URL: https://issues.apache.org/jira/browse/HDFS-7380
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Adam Fuchs
 Attachments: BenchmarkWrites.java


 Appending to a large file with block size > 2GB can lead to periods of really 
 poor performance (4X slower than optimal). I found this issue when looking at 
 Accumulo write performance in ACCUMULO-3303. I wrote a small test application 
 to isolate the performance problem down to some basic API calls (to be attached). 
 A description of the execution can be found here: 
 https://issues.apache.org/jira/browse/ACCUMULO-3303?focusedCommentId=14202830page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14202830
 The specific hadoop version was as follows:
 {code}
 [root@n1 ~]# hadoop version
 Hadoop 2.4.0.2.1.2.0-402
 Subversion g...@github.com:hortonworks/hadoop.git -r 
 9e5db004df1a751e93aa89b42956c5325f3a4482
 Compiled by jenkins on 2014-04-27T22:28Z
 Compiled with protoc 2.5.0
 From source with checksum 9e788148daa5dd7934eb468e57e037b5
 This command was run using /usr/lib/hadoop/hadoop-common-2.4.0.2.1.2.0-402.jar
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7040) HDFS dangerously uses @Beta methods from very old versions of Guava

2014-11-07 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202856#comment-14202856
 ] 

Christopher Tubbs commented on HDFS-7040:
-

I added a patch to fix this under MAPREDUCE-6083 for versions 2.6.0 and later 
which doesn't change the Guava version dependency. I suppose it could be 
back-ported to earlier versions (2.4/2.5), but it's probably not worth it since 
those versions are really only affected by {{MiniDFSCluster}}, and that's very 
limited.

 HDFS dangerously uses @Beta methods from very old versions of Guava
 ---

 Key: HDFS-7040
 URL: https://issues.apache.org/jira/browse/HDFS-7040
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0, 2.5.0, 2.4.1
Reporter: Christopher Tubbs
  Labels: beta, deprecated, guava
 Attachments: 0001-HDFS-7040-Avoid-beta-LimitInputStream-in-Guava.patch


 HDFS uses LimitInputStream from Guava. This was introduced as @Beta and is 
 risky for any application to use.
 The problem is further exacerbated by Hadoop's dependency on Guava version 
 11.0.2, which is quite old for an active project (Feb. 2012).
 Because Guava is very stable, projects which depend on Hadoop and use Guava 
 themselves can use up through Guava version 14.x.
 However, in version 14, Guava deprecated LimitInputStream and provided a 
 replacement. Because they make no compatibility guarantees for @Beta 
 classes, they removed it in version 15.
 What should be done: Hadoop should update its dependency on Guava to at 
 least version 14 (currently Guava is on version 19). This should have little 
 impact on users, because Guava is so stable.
 HDFS should then be patched to use the provided alternative to 
 LimitInputStream, so that downstream packagers, users, and application 
 developers requiring more recent versions of Guava (to fix bugs, to use new 
 features, etc.) will be able to swap out the Guava dependency without 
 breaking Hadoop.
 Alternative: While Hadoop cannot predict the marking and removal of 
 deprecated code, it can, and should, avoid the use of @Beta classes and 
 methods that do not offer guarantees. If the dependency cannot be bumped, 
 then it should be relatively trivial to provide an internal class with the 
 same functionality, that does not rely on the older version of Guava.
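 For illustration, a minimal sketch of the "internal class" alternative 
 described above, under an assumed class name; this is not the actual HDFS or 
 MAPREDUCE-6083 patch.
 {code}
 // Hypothetical sketch of an internal replacement for Guava's LimitInputStream.
 import java.io.FilterInputStream;
 import java.io.IOException;
 import java.io.InputStream;

 public class BoundedInputStream extends FilterInputStream {
   private long remaining;   // bytes still allowed to be read

   public BoundedInputStream(InputStream in, long limit) {
     super(in);
     this.remaining = limit;
   }

   @Override
   public int read() throws IOException {
     if (remaining <= 0) {
       return -1;                       // limit reached, behave like EOF
     }
     int b = in.read();
     if (b != -1) {
       remaining--;
     }
     return b;
   }

   @Override
   public int read(byte[] buf, int off, int len) throws IOException {
     if (remaining <= 0) {
       return -1;
     }
     int n = in.read(buf, off, (int) Math.min(len, remaining));
     if (n > 0) {
       remaining -= n;
     }
     return n;
   }
 }
 {code}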



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202890#comment-14202890
 ] 

Hadoop QA commented on HDFS-7358:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680257/h7358_20141107.patch
  against trunk revision 06b7979.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.balancer.TestBalancer

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8693//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8693//console

This message is automatically generated.

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch, h7358_20141107.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-07 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-7381:


 Summary: Decouple the management of block id and gen stamps from 
FSNamesystem
 Key: HDFS-7381
 URL: https://issues.apache.org/jira/browse/HDFS-7381
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


The block layer should be responsible for managing block ids and generation 
stamps. Currently this functionality is misplaced in {{FSNamesystem}}.

This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7381:
-
Attachment: HDFS-7381.000.patch

 Decouple the management of block id and gen stamps from FSNamesystem
 

 Key: HDFS-7381
 URL: https://issues.apache.org/jira/browse/HDFS-7381
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7381.000.patch


 The block layer should be responsible for managing block ids and generation 
 stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
 This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202922#comment-14202922
 ] 

Haohui Mai commented on HDFS-7381:
--

The v1 patch creates a new class ({{BlockIdManager}} in the {{blockmanagement}} 
package) to manage the block ids and generation stamps. {{FSNamesystem}} is 
still responsible for persisting the latest generation stamp and block id in the 
edit logs.
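
A rough sketch of what such a class might look like; the method names below are 
illustrative assumptions and not necessarily those in HDFS-7381.000.patch.
{code}
// Illustrative skeleton only; the actual BlockIdManager in the patch may differ.
public class BlockIdManager {
  private long generationStamp;          // latest generation stamp handed out
  private long lastAllocatedBlockId;     // latest sequential block id handed out

  public synchronized long nextGenerationStamp() {
    return ++generationStamp;
  }

  public synchronized long nextBlockId() {
    return ++lastAllocatedBlockId;
  }

  // Called when loading the FSImage / edit logs so the counters resume correctly.
  public synchronized void setGenerationStamp(long stamp) {
    this.generationStamp = stamp;
  }

  public synchronized void setLastAllocatedBlockId(long blockId) {
    this.lastAllocatedBlockId = blockId;
  }
}
{code}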

 Decouple the management of block id and gen stamps from FSNamesystem
 

 Key: HDFS-7381
 URL: https://issues.apache.org/jira/browse/HDFS-7381
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7381.000.patch


 The block layer should be responsible for managing block ids and generation 
 stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
 This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202894#comment-14202894
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7279:
---

Reviewing the patch.  Some questions/comments from the first half:

- In JspHelper.checkUsername(..), why remove the tryUgiParameter if-statement?

- In URLDispatcher.channelRead0(..), how about checking the webhdfs URI first and 
then using SimpleHttpProxyHandler for everything else?  I.e.
{code}
if (uri.startsWith("/webhdfs/v1")) {
  WebHdfsHandler h = new WebHdfsHandler(conf, confForCreate);
  p.replace(this, proxy, h);
  h.channelRead0(ctx, req);
} else {
  SimpleHttpProxyHandler h = new SimpleHttpProxyHandler(proxyHost);
  p.replace(this, proxy, h);
  h.channelRead0(ctx, req);
}
{code}

- DatanodeHttpServer.close() should throw IOException.  Then, we don't need to 
convert IOException to RuntimeException.  Also, do we want to destroy the ssl 
factory before closing the channel?  Or put it in a finally block (see the 
sketch after this list)?

- In SimpleHttpProxyHandler,
-* Forwarder.channelRead(..): should the two LOG.warn be LOG.debug?
-* Forwarder.exceptionCaught(..): should the LOG.info be LOG.warn/error?
-* channelRead0(..): should the LOG.info be LOG.warn/error?
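
For illustration only, a minimal sketch of the close() shape suggested in the 
third bullet above; the class shape and field names (httpChannel, sslFactory) 
are assumptions, not the actual DatanodeHttpServer code.
{code}
// Sketch only; field names are hypothetical.
import java.io.IOException;
import io.netty.channel.Channel;
import org.apache.hadoop.security.ssl.SSLFactory;

class DatanodeHttpServerCloseSketch {
  private Channel httpChannel;     // assumed netty server channel
  private SSLFactory sslFactory;   // assumed Hadoop SSL factory, may be null

  public void close() throws IOException {
    try {
      if (httpChannel != null) {
        httpChannel.close().awaitUninterruptibly();  // shut the channel down first
      }
    } finally {
      if (sslFactory != null) {
        sslFactory.destroy();                        // always release the SSL factory
      }
    }
  }
}
{code}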

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version currently used by the DN (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202938#comment-14202938
 ] 

Haohui Mai commented on HDFS-7279:
--

Updated the patch to address Nicholas's comments.

I cleaned up the usages of LOG in {{SimpleHttpProxyHandler}} in the v8 patch. I 
kept the LOG at INFO level when an exception occurs. My intuition is that it is 
usually not a serious issue when this type of error happens, thus making them 
WARN might generate unnecessary noise. I have no strong opinion on that; I'm 
okay with changing it to WARN if you think it is more appropriate.

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch, 
 HDFS-7279.008.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version currently used by the DN (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7381:
-
Status: Patch Available  (was: Open)

 Decouple the management of block id and gen stamps from FSNamesystem
 

 Key: HDFS-7381
 URL: https://issues.apache.org/jira/browse/HDFS-7381
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7381.000.patch


 The block layer should be responsible for managing block ids and generation 
 stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
 This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7279:
-
Attachment: HDFS-7279.008.patch

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch, 
 HDFS-7279.008.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version currently used by the DN (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7358) Clients may get stuck waiting when using ByteArrayManager

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202953#comment-14202953
 ] 

Hadoop QA commented on HDFS-7358:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680257/h7358_20141107.patch
  against trunk revision 06b7979.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestHFlush

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8692//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8692//console

This message is automatically generated.

 Clients may get stuck waiting when using ByteArrayManager
 -

 Key: HDFS-7358
 URL: https://issues.apache.org/jira/browse/HDFS-7358
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h7358_20141104.patch, h7358_20141104_wait_timeout.patch, 
 h7358_20141105.patch, h7358_20141106.patch, h7358_20141107.patch


 [~stack] reported that clients might get stuck waiting when using 
 ByteArrayManager; see [his 
 comments|https://issues.apache.org/jira/browse/HDFS-7276?focusedCommentId=14197036page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14197036].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN

2014-11-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202971#comment-14202971
 ] 

Colin Patrick McCabe commented on HDFS-7314:


bq. It turns out a new bug not related to this was discovered by this change. 
If the DataStreamer thread exits and closes the stream before the application 
closes the stream, DFSClient will keep renewing the lease. That is because 
DataStreamer's closeInternal marks the stream closed but doesn't call DFSClient's 
endFileLease. Later, when the application closes the stream, it will skip 
DFSClient's endFileLease given the stream has been closed.

You're right that there is a bug here.  There is a lot of discussion about what 
to do about this issue in HDFS-4504.  It's not as simple as just calling 
{{endFileLease}}... if we missed calling {{completeFile}}, the NN will continue 
to think that we have a lease open on this file.  I think we should avoid 
modifying {{DFSOutputStream#close}} here.  We should try to keep this JIRA 
focused on just the description.  Plus HDFS-4504 is a complex issue, not easy 
to solve.

{{TestDFSClientRetries.java}}: let's get rid of the unnecessary whitespace 
change in the current patch.

I like the idea of getting rid of the {{DFSClient#abort}} function.

The patch looks good; once these things are removed, it should be ready to go soon!

 Aborted DFSClient's impact on long running service like YARN
 

 Key: HDFS-7314
 URL: https://issues.apache.org/jira/browse/HDFS-7314
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
 HDFS-7314.patch


 It happened in a YARN nodemanager scenario, but it could happen to any long 
 running service that uses a cached instance of DistributedFileSystem.
 1. The active NN is under heavy load, so it became unavailable for 10 minutes; 
 any DFSClient request will get ConnectTimeoutException.
 2. The YARN nodemanager uses DFSClient for certain write operations such as the 
 log aggregator or the shared cache in YARN-1492. The DFSClient used by the YARN 
 NM's renewLease RPC got a ConnectTimeoutException.
 {noformat}
 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
 renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
 Aborting ...
 {noformat}
 3. After DFSClient is in Aborted state, YARN NM can't use that cached 
 instance of DistributedFileSystem.
 {noformat}
 2014-10-29 20:26:23,991 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Failed to download rsrc...
 java.io.IOException: Filesystem closed
 at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
 at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
 Given the call stack is YARN -> DistributedFileSystem -> DFSClient, this can 
 be addressed at different layers.
 * YARN closes the DistributedFileSystem object when it receives some well 
 defined exception. Then the next HDFS call will create a new instance of 
 DistributedFileSystem. We have to fix all the places in YARN. Plus other HDFS 
 applications need to address this as well.
 * DistributedFileSystem detects an Aborted DFSClient and creates a new instance 
 of DFSClient. We will need to fix all the places DistributedFileSystem calls 
 DFSClient.
 * After DFSClient gets into the Aborted state, it doesn't have to reject all 
 requests; instead it can retry. If the NN is available again, it can transition 
 back to a healthy state.
 Comments?



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202999#comment-14202999
 ] 

Colin Patrick McCabe commented on HDFS-6803:


I'm having trouble reconciling the idea that input streams are not 
thread-safe with the idea that multiple positional reads can be going on in 
parallel.  It seems like if clients are going to have multiple {{pread}} calls 
in flight, they are counting on thread-safety.  Maybe we can say that streams 
which implement {{PositionedReadable}} are thread-safe?

Are there still Hadoop FileSystem implementations out there that have input 
streams that are not thread-safe?  That seems like a recipe for broken code 
that runs on HDFS but not on other FSes.  It seems like if we do have any such 
FS implementations, they could be fixed pretty easily by putting 
{{synchronized}} on the methods.
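
For illustration only, a minimal sketch of the kind of fix suggested above for 
a hypothetical non-thread-safe stream; this is not code from any actual 
FileSystem implementation.
{code}
import java.io.IOException;

// Hypothetical stream. Making the positional read synchronized serializes
// concurrent preads so they cannot corrupt the shared cursor used by read().
class SimpleSeekableStream {
  private long pos;                          // shared sequential cursor

  public synchronized int read(long position, byte[] buf, int off, int len)
      throws IOException {
    long saved = pos;                        // remember the sequential cursor
    pos = position;                          // seek to the requested offset
    try {
      return readFromCurrentPos(buf, off, len);
    } finally {
      pos = saved;                           // restore it for sequential readers
    }
  }

  // Stand-in for the real sequential read at the current cursor.
  private int readFromCurrentPos(byte[] buf, int off, int len) throws IOException {
    return -1;
  }
}
{code}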

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt


 Reviews of the patch posted the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputSteam.  If agreement, we'll attach to this issue 
 a patch that weaves the assumptions into DFSIS as javadoc and class comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203170#comment-14203170
 ] 

Hadoop QA commented on HDFS-7279:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680308/HDFS-7279.008.patch
  against trunk revision c3d4750.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.util.TestByteArrayManager
org.apache.hadoop.hdfs.TestParallelUnixDomainRead

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8695//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8695//console

This message is automatically generated.

 Use netty to implement DatanodeWebHdfsMethods
 -

 Key: HDFS-7279
 URL: https://issues.apache.org/jira/browse/HDFS-7279
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, webhdfs
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, 
 HDFS-7279.002.patch, HDFS-7279.003.patch, HDFS-7279.004.patch, 
 HDFS-7279.005.patch, HDFS-7279.006.patch, HDFS-7279.007.patch, 
 HDFS-7279.008.patch


 Currently the DN implements all related webhdfs functionality using jetty. As 
 the jetty version currently used by the DN (jetty 6) lacks fine-grained buffer 
 and connection management, the DN often suffers from long latency and OOM when 
 its webhdfs component is under sustained heavy load.
 This jira proposes to implement the webhdfs component in the DN using netty, 
 which can be more efficient and allow finer-grained control over webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203171#comment-14203171
 ] 

Hadoop QA commented on HDFS-7381:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680304/HDFS-7381.000.patch
  against trunk revision c3d4750.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

  {color:red}-1 javac{color}.  The applied patch generated 1218 javac 
compiler warnings (more than the trunk's current 1217 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.util.TestByteArrayManager
org.apache.hadoop.hdfs.TestParallelUnixDomainRead

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8694//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8694//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8694//console

This message is automatically generated.

 Decouple the management of block id and gen stamps from FSNamesystem
 

 Key: HDFS-7381
 URL: https://issues.apache.org/jira/browse/HDFS-7381
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7381.000.patch


 The block layer should be responsible for managing block ids and generation 
 stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
 This jira proposes to decouple it from the {{FSNamesystem}} class.
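
As a hedged sketch of the separation being proposed, the following hypothetical 
helper shows block id and generation stamp allocation living in its own class 
that the namesystem merely persists; the class and method names are 
illustrative, not the actual HDFS-7381 patch.

{noformat}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the kind of helper HDFS-7381 argues for: block id and
// generation stamp allocation owned by the block layer, not by FSNamesystem.
// Names are illustrative, not the actual patch.
public class BlockIdAndGenStampAllocator {
  private final AtomicLong lastBlockId;
  private final AtomicLong lastGenStamp;

  public BlockIdAndGenStampAllocator(long lastBlockId, long lastGenStamp) {
    this.lastBlockId = new AtomicLong(lastBlockId);
    this.lastGenStamp = new AtomicLong(lastGenStamp);
  }

  /** Allocates the next sequential block id. */
  public long nextBlockId() {
    return lastBlockId.incrementAndGet();
  }

  /** Allocates the next generation stamp. */
  public long nextGenerationStamp() {
    return lastGenStamp.incrementAndGet();
  }

  /** Values the namesystem would persist in its image and edit log. */
  public long getLastBlockId()  { return lastBlockId.get(); }
  public long getLastGenStamp() { return lastGenStamp.get(); }
}
{noformat}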



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN

2014-11-07 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-7314:
--
Attachment: HDFS-7314-5.patch

Thanks, Colin. I didn't know the lease leak is a known issue.

Here is the updated patch. Given the lease leak issue, LeaseRenewal can't 
rely on {{closeAllFilesBeingWritten}} to close all leases, so it has to call 
{{CloseClient}}.

The {{testLeaseRenewSocketTimeout}} test added to {{TestDFSClientRetries}} 
doesn't seem to have unnecessary whitespace. Do you mean newlines? The updated 
patch removes the unnecessary newlines.

 Aborted DFSClient's impact on long running service like YARN
 

 Key: HDFS-7314
 URL: https://issues.apache.org/jira/browse/HDFS-7314
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
 HDFS-7314-5.patch, HDFS-7314.patch


 It happened in a YARN nodemanager scenario, but it could happen to any 
 long-running service that uses a cached instance of DistributedFileSystem.
 1. The active NN is under heavy load, so it became unavailable for 10 minutes; 
 any DFSClient request will get ConnectTimeoutException.
 2. The YARN nodemanager uses DFSClient for certain write operations such as 
 the log aggregator or the shared cache in YARN-1492. The renewLease RPC of the 
 DFSClient used by the YARN NM got ConnectTimeoutException.
 {noformat}
 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
 renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
 Aborting ...
 {noformat}
 3. After the DFSClient is in the Aborted state, the YARN NM can't use that 
 cached instance of DistributedFileSystem.
 {noformat}
 2014-10-29 20:26:23,991 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Failed to download rsrc...
 java.io.IOException: Filesystem closed
 at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
 at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
 Given the call stack is YARN -> DistributedFileSystem -> DFSClient, this can 
 be addressed at different layers.
 * YARN closes the DistributedFileSystem object when it receives some 
 well-defined exception. The next HDFS call will then create a new instance of 
 DistributedFileSystem. We have to fix all the places in YARN, and other HDFS 
 applications need to address this as well (a sketch of this option follows 
 below).
 * DistributedFileSystem detects the aborted DFSClient and creates a new 
 instance of DFSClient. We will need to fix all the places DistributedFileSystem 
 calls DFSClient.
 * After DFSClient gets into the Aborted state, it doesn't have to reject all 
 requests; instead it can retry. If the NN is available again it can transition 
 back to the healthy state.
 Comments?
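
As a rough illustration of the first option, here is a hedged sketch of an 
application-side wrapper that drops its FileSystem instance when it hits the 
aborted client and obtains a fresh one. The class name, the single-retry 
policy, and the message-string check are illustrative assumptions, not part of 
the HDFS-7314 patches; only {{FileSystem.newInstance}}, {{getFileStatus}}, and 
the "Filesystem closed" message from the stack trace above come from Hadoop 
itself.

{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch of option 1: the caller discards its FileSystem when it sees
// the aborted client and obtains a fresh instance. Retry policy is illustrative.
public class RecreateFsOnAbort {
  private final Configuration conf = new Configuration();
  private volatile FileSystem fs;

  public RecreateFsOnAbort() throws IOException {
    this.fs = FileSystem.newInstance(conf);   // non-cached instance
  }

  public FileStatus statusWithRetry(Path p) throws IOException {
    try {
      return fs.getFileStatus(p);
    } catch (IOException e) {
      // "Filesystem closed" is what DFSClient.checkOpen() throws once aborted.
      if (e.getMessage() != null && e.getMessage().contains("Filesystem closed")) {
        try {
          fs.close();                         // discard the aborted instance
        } catch (IOException ignored) {
          // the underlying client is already unusable
        }
        fs = FileSystem.newInstance(conf);    // and build a new one
        return fs.getFileStatus(p);
      }
      throw e;
    }
  }
}
{noformat}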



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7374:

Status: Patch Available  (was: Open)

 Allow decommissioning of dead DataNodes
 ---

 Key: HDFS-7374
 URL: https://issues.apache.org/jira/browse/HDFS-7374
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang

 We have seen the use case of decommissioning DataNodes that are already dead 
 or unresponsive and are not expected to rejoin the cluster.
 The logic introduced by HDFS-6791 will mark those nodes as 
 {{DECOMMISSION_INPROGRESS}}, in the hope that they can come back and finish 
 the decommission work. If an upper-layer application is monitoring the 
 decommissioning progress, it will hang forever.
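
To make the "hang forever" concrete, here is a hedged sketch of the kind of 
polling loop an upper-layer application might run; the class name, host 
argument, and sleep interval are made up for illustration, and it assumes the 
default filesystem is HDFS so that {{DistributedFileSystem#getDataNodeStats}} 
and {{DatanodeInfo#isDecommissioned}} are available. A node that is dead and 
stuck in {{DECOMMISSION_INPROGRESS}} never satisfies the exit condition, so 
the loop never returns.

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

// Hedged sketch of an upper-layer monitor; names and interval are illustrative.
public class WaitForDecommission {
  public static void waitFor(String host) throws Exception {
    // Assumes fs.defaultFS points at HDFS, so the cast below is safe.
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    while (true) {
      boolean done = false;
      for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        if (dn.getHostName().equals(host) && dn.isDecommissioned()) {
          done = true;        // node reached DECOMMISSIONED
        }
      }
      if (done) {
        return;               // decommission finished
      }
      Thread.sleep(30000L);   // poll again; never exits for a dead, stuck node
    }
  }
}
{noformat}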



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7374:

Attachment: HDFS-7374-001.patch

Thanks [~mingma] again for the comment.

This patch implements option #2. It also moves a utility function to 
{{DFSTestUtil}} so it's accessible in the new unit test.

 Allow decommissioning of dead DataNodes
 ---

 Key: HDFS-7374
 URL: https://issues.apache.org/jira/browse/HDFS-7374
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-7374-001.patch


 We have seen the use case of decommissioning DataNodes that are already dead 
 or unresponsive and are not expected to rejoin the cluster.
 The logic introduced by HDFS-6791 will mark those nodes as 
 {{DECOMMISSION_INPROGRESS}}, in the hope that they can come back and finish 
 the decommission work. If an upper-layer application is monitoring the 
 decommissioning progress, it will hang forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7374) Allow decommissioning of dead DataNodes

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203257#comment-14203257
 ] 

Hadoop QA commented on HDFS-7374:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680362/HDFS-7374-001.patch
  against trunk revision 9a4e0d3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8697//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8697//console

This message is automatically generated.

 Allow decommissioning of dead DataNodes
 ---

 Key: HDFS-7374
 URL: https://issues.apache.org/jira/browse/HDFS-7374
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-7374-001.patch


 We have seen the use case of decommissioning DataNodes that are already dead 
 or unresponsive and are not expected to rejoin the cluster.
 The logic introduced by HDFS-6791 will mark those nodes as 
 {{DECOMMISSION_INPROGRESS}}, in the hope that they can come back and finish 
 the decommission work. If an upper-layer application is monitoring the 
 decommissioning progress, it will hang forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7314) Aborted DFSClient's impact on long running service like YARN

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203256#comment-14203256
 ] 

Hadoop QA commented on HDFS-7314:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680355/HDFS-7314-5.patch
  against trunk revision 4a114dd.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8696//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8696//console

This message is automatically generated.

 Aborted DFSClient's impact on long running service like YARN
 

 Key: HDFS-7314
 URL: https://issues.apache.org/jira/browse/HDFS-7314
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
 HDFS-7314-5.patch, HDFS-7314.patch


 It happened in a YARN nodemanager scenario, but it could happen to any 
 long-running service that uses a cached instance of DistributedFileSystem.
 1. The active NN is under heavy load, so it became unavailable for 10 minutes; 
 any DFSClient request will get ConnectTimeoutException.
 2. The YARN nodemanager uses DFSClient for certain write operations such as 
 the log aggregator or the shared cache in YARN-1492. The renewLease RPC of the 
 DFSClient used by the YARN NM got ConnectTimeoutException.
 {noformat}
 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
 renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
 Aborting ...
 {noformat}
 3. After the DFSClient is in the Aborted state, the YARN NM can't use that 
 cached instance of DistributedFileSystem.
 {noformat}
 2014-10-29 20:26:23,991 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Failed to download rsrc...
 java.io.IOException: Filesystem closed
 at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
 at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
 Given the call stack is YARN -> DistributedFileSystem -> DFSClient, this can 
 be addressed at different layers.
 * YARN closes the DistributedFileSystem object when it receives some 
 well-defined exception. The next HDFS call will then create a new instance of 
 DistributedFileSystem. We have to fix all the places in YARN, and other HDFS 
 applications need to address this as well.
 * DistributedFileSystem detects the aborted DFSClient and creates a new 
 instance of DFSClient. We will need to fix all the places DistributedFileSystem 
 calls DFSClient.
 * After DFSClient gets into the Aborted state, it doesn't have to reject all 
 requests; instead it can retry. If the NN is available again it can transition 
 back to the healthy state.