[jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers
[ https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934725#comment-13934725 ]

Devaraj Das commented on HDFS-6010:
-----------------------------------

[~sanjay.radia], could you please take a look at the proposal here?

Make balancer able to balance data among specified servers
----------------------------------------------------------

Key: HDFS-6010
URL: https://issues.apache.org/jira/browse/HDFS-6010
Project: Hadoop HDFS
Issue Type: Improvement
Components: balancer
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor
Attachments: HDFS-6010-trunk.patch

Currently, the balancer tool balances data among all datanodes. However, in some particular cases we need to balance data only among specified nodes instead of the whole set. This JIRA introduces a new -servers option to implement this.

--
This message was sent by Atlassian JIRA (v6.2#6252)
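In essence, the -servers option restricts the balancer's working set to an admin-supplied host list. A minimal sketch of that filtering step (the class, method, and host:port strings here are illustrative assumptions, not the actual patch code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: keep only the datanodes whose host:port appears in the
// set passed via -servers, so balancing happens within that subset only.
public class ServerFilter {
    static List<String> filter(List<String> allDatanodes, Set<String> servers) {
        List<String> picked = new ArrayList<>();
        for (String dn : allDatanodes) {
            if (servers.contains(dn)) {
                picked.add(dn);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("dn1:50010", "dn2:50010", "dn3:50010");
        Set<String> group = new HashSet<>(Arrays.asList("dn1:50010", "dn3:50010"));
        // Only dn1 and dn3 remain candidates for balancing.
        System.out.println(filter(all, group));
    }
}
```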
[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation
[ https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934814#comment-13934814 ]

Yu Li commented on HDFS-6009:
-----------------------------

{quote}
In particular, what caused the failure in your case? Is it a disk error, network failure, or an application is buggy?
{quote}
In our production environment we have encountered almost all of the cases listed above, and experienced a hard time comforting angry users. The buggy-application case is especially painful: the other affected users become furious about being punished for someone else's fault. So in our case isolation is necessary.

To be more specific, our service is based on HBase, so the tools supplied here are used along with the HBase regionserver group feature (HBASE-6721). If you're interested in our use case, I've given a more detailed introduction [here|https://issues.apache.org/jira/browse/HDFS-6010?focusedCommentId=13932891&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932891] in HDFS-6010 (just allow me to save some copy-paste effort :-)).

Another thing to clarify is that this suite of tools won't persist any datanode group information into HDFS. All three tools accept a -servers option, so the admin needs to keep the group information in mind and pass it to the tools, or, as in our use case, persist it in an upper-level component like HBase.

[~thanhdo], hope this answers your question; just let me know if you have any further comments.

Tools based on favored node feature for isolation
-------------------------------------------------

Key: HDFS-6009
URL: https://issues.apache.org/jira/browse/HDFS-6009
Project: Hadoop HDFS
Issue Type: Task
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor

There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where in multi-tenant deployments of HBase we prefer to assign several groups of regionservers to serve different applications, to achieve some kind of isolation or resource allocation. However, although the regionservers are grouped, the datanodes which store the data are not, so a single datanode failure affects multiple applications, as we have already observed in our production environment.

To relieve this issue, we could make use of the favored node feature (HDFS-2576) so that a regionserver locates data within its group; in other words, the datanodes also become (passively) grouped, forming some level of isolation. In this case, or any other case that needs datanodes to be grouped, we would need a set of tools to maintain the groups:

1. Make the balancer able to balance data among specified servers, rather than the whole set
2. Set the balance bandwidth for specified servers, rather than the whole set
3. A tool to check whether a block is placed across groups, and move it back if so

This JIRA is an umbrella for the above tools.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934872#comment-13934872 ]

Hudson commented on HDFS-6080:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #509 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/509/])
HDFS-6080. Improve NFS gateway performance by making rtmax and wtmax configurable. Contributed by Abin Shahab (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577319)
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Constant.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm

Improve NFS gateway performance by making rtmax and wtmax configurable
----------------------------------------------------------------------

Key: HDFS-6080
URL: https://issues.apache.org/jira/browse/HDFS-6080
Project: Hadoop HDFS
Issue Type: Improvement
Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
Fix For: 2.4.0
Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch

Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server, and therefore affect read and write performance. We ran performance tests with 1 MB, 100 MB, and 1 GB files, and noticed a significant performance decline with increasing size when compared to fuse. The issue was the hardcoded rtmax size (64 KB); when we increased rtmax to 1 MB, we got a 10x improvement in performance.

NFS reads:
| File          | Size       | Run 1         | Run 2         | Run 3         | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435  | 1.7172094782219731    |
| testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113  | 8.14037175506561      |
| testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667  | 0.030412987573361663  |

Fuse reads:
| File          | Size       | Run 1       | Run 2        | Run 3        | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 2.530063267997  | 0.12457410127142007   |
| testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297  | 0.386672412437576     |
| testFile1Mb   | 1048576    | 0.271615094 | 0.270835986  | 0.271796438  | 0.271415839333  | 0.0004166483951065848 |

NFS reads after rtmax = 1 MB:
| File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
| testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
| testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |
| testFile1Mb   | 1048576    | 0.115602858  | 0.106826253 | 0.125229976  | 0.1158863623334 | 0.007515962395481867 |

--
This message was sent by Atlassian JIRA (v6.2#6252)
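Since the patch turns the two hardcoded limits into configuration keys, an admin would set them in hdfs-site.xml. A sketch of such a fragment, with the caveat that the key names below are assumptions for illustration; check Nfs3Constant.java in your release for the authoritative names:

```xml
<!-- Hypothetical hdfs-site.xml fragment: raise the NFS gateway's maximum
     read/write transfer sizes to 1 MB, matching the 10x result above.
     Key names are an assumption; verify against Nfs3Constant.java. -->
<property>
  <name>dfs.nfs.rtmax</name>
  <value>1048576</value>
</property>
<property>
  <name>dfs.nfs.wtmax</name>
  <value>1048576</value>
</property>
```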
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934878#comment-13934878 ]

Hudson commented on HDFS-6097:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #509 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/509/])
HDFS-6097. Zero-copy reads are incorrectly disabled on file offsets above 2GB (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577350)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java

zero-copy reads are incorrectly disabled on file offsets above 2GB
------------------------------------------------------------------

Key: HDFS-6097
URL: https://issues.apache.org/jira/browse/HDFS-6097
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Fix For: 2.4.0
Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch

Zero-copy reads are incorrectly disabled on file offsets above 2GB. The culprit is code that is supposed to disable zero-copy reads only for offsets within a block file greater than 2GB (because MappedByteBuffer segments are limited to that size), but which ends up disabling them based on the absolute file offset as well.

--
This message was sent by Atlassian JIRA (v6.2#6252)
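The distinction at the heart of the bug fits in a few lines: what must stay under the ~2GB MappedByteBuffer ceiling is the offset inside the block file, not the absolute file offset. The guard below is a hypothetical illustration of that check, not the actual DFSInputStream code:

```java
public class ZeroCopyOffsetCheck {
    // MappedByteBuffer segments are indexed by int, so ~2GB is the ceiling.
    static final long MAX_MAP_BYTES = Integer.MAX_VALUE;

    // Hypothetical guard: zero-copy stays legal as long as the *block-relative*
    // range fits in one mapped segment, regardless of the absolute file offset.
    static boolean zeroCopyAllowed(long fileOffset, long blockStartInFile, int length) {
        long offsetInBlock = fileOffset - blockStartInFile;
        return offsetInBlock >= 0 && offsetInBlock + length <= MAX_MAP_BYTES;
    }

    public static void main(String[] args) {
        long fileOffset = 3L << 30;                 // 3 GB into the file
        long blockStart = fileOffset - (64L << 20); // this block began 64 MB earlier
        // Checking the absolute offset (> 2GB) would wrongly disable zero-copy;
        // checking the block-relative offset (64 MB) correctly allows it.
        System.out.println(zeroCopyAllowed(fileOffset, blockStart, 1 << 16));
    }
}
```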
[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934875#comment-13934875 ]

Hudson commented on HDFS-5244:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #509 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/509/])
HDFS-5244. TestNNStorageRetentionManager#testPurgeMultipleDirs fails. Contributed by Jinghui Wang. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577254)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java

TestNNStorageRetentionManager#testPurgeMultipleDirs fails
---------------------------------------------------------

Key: HDFS-5244
URL: https://issues.apache.org/jira/browse/HDFS-5244
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 2.1.0-beta
Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6
Reporter: Jinghui Wang
Assignee: Jinghui Wang
Fix For: 3.0.0, 2.1.0-beta, 2.4.0
Attachments: HDFS-5244.patch

The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap has no predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So when the directories get mocked for the test, they may already be out of the order in which they were added. Thus the order in which the directories are actually purged can differ from the order in which they were added to the LinkedHashSet, causing the test to fail.

--
This message was sent by Atlassian JIRA (v6.2#6252)
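The root cause above is the standard HashMap-vs-LinkedHashSet ordering contract, which a few lines of plain Java make concrete (the names here are illustrative, not taken from the test):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class IterationOrderDemo {
    // A LinkedHashSet round trip preserves insertion order by contract.
    static List<String> throughLinkedHashSet(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        List<String> dirRoots = Arrays.asList("root0", "root1", "root2");
        // Guaranteed: insertion order survives the LinkedHashSet.
        System.out.println(throughLinkedHashSet(dirRoots).equals(dirRoots)); // true

        // Not guaranteed: a HashMap's keySet() may iterate in any order, and
        // that order can change across JVM vendors and versions -- exactly the
        // mismatch that made the purge test flaky.
        Map<String, Boolean> roots = new HashMap<>();
        for (String d : dirRoots) roots.put(d, true);
        System.out.println(roots.keySet()); // some permutation of dirRoots
    }
}
```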
[jira] [Commented] (HDFS-6102) Lower the default maximum items per directory to fix PB fsimage loading
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934880#comment-13934880 ]

Hudson commented on HDFS-6102:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #509 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/509/])
HDFS-6102. Lower the default maximum items per directory to fix PB fsimage loading. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577426)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/fsimage.proto
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java

Lower the default maximum items per directory to fix PB fsimage loading
-----------------------------------------------------------------------

Key: HDFS-6102
URL: https://issues.apache.org/jira/browse/HDFS-6102
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Blocker
Fix For: 2.4.0
Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch

Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and we hit this error when trying to load the resulting very large fsimage:

{noformat}
2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes.
2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742)
com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
	at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
	at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
	at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
	at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
	at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896)
	...
{noformat}

Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here.

--
This message was sent by Atlassian JIRA (v6.2#6252)
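A back-of-the-envelope check shows why ~24.5M children in one directory trips protobuf's 64MB default limit; the bytes-per-entry figure below is an assumed average for a varint-encoded child id, purely for illustration:

```java
public class PbSizeEstimate {
    // Rough estimate of a DirEntry-bearing message's size: children * bytes
    // per varint-encoded child id (the per-entry cost is an assumption).
    static long estimatedBytes(long children, long bytesPerEntry) {
        return children * bytesPerEntry;
    }

    public static void main(String[] args) {
        long children = 24_523_605L;      // INode count from the log above
        long pbLimit = 64L * 1024 * 1024; // protobuf default message size limit
        long estimated = estimatedBytes(children, 5);
        // Even at ~5 bytes per entry the message dwarfs the 64MB limit,
        // which is why the fix caps items per directory instead.
        System.out.println(estimated > pbLimit); // true
    }
}
```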
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934877#comment-13934877 ]

Hudson commented on HDFS-6084:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #509 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/509/])
HDFS-6084. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage. Contributed by Travis Thompson. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577401)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html

Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
--------------------------------------------------------------

Key: HDFS-6084
URL: https://issues.apache.org/jira/browse/HDFS-6084
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 2.3.0
Reporter: Travis Thompson
Assignee: Travis Thompson
Priority: Minor
Fix For: 2.4.0
Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt

When clicking the Hadoop title, the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think the title should go to the main Namenode page, #tab-overview. Suggestions?

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Miodrag Radulovic updated HDFS-5516:
------------------------------------

Attachment: HDFS-5516.patch

Fixed the encoding of the patch file.

WebHDFS does not require user name when anonymous http requests are disallowed.
-------------------------------------------------------------------------------

Key: HDFS-5516
URL: https://issues.apache.org/jira/browse/HDFS-5516
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 3.0.0, 1.2.1, 2.2.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Attachments: HDFS-5516.patch, HDFS-5516.patch, HDFS-5516.patch

WebHDFS requests do not require a user name to be specified in the request URL, even when the core-site configuration sets HTTP authentication to simple and disables anonymous authentication.

--
This message was sent by Atlassian JIRA (v6.2#6252)
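For context, under simple authentication a WebHDFS request identifies the caller via the user.name query parameter in the URL; the bug is that its absence was not rejected when anonymous access is disabled. The helper below merely assembles such a URL (the host, port, and user are hypothetical):

```java
public class WebHdfsUrl {
    // Build a WebHDFS GETFILESTATUS URL carrying an explicit user.name
    // parameter, as required under simple auth with anonymous access disabled.
    static String statusUrl(String host, int port, String path, String user) {
        return "http://" + host + ":" + port + "/webhdfs/v1" + path
                + "?op=GETFILESTATUS&user.name=" + user;
    }

    public static void main(String[] args) {
        System.out.println(statusUrl("nn.example.com", 50070, "/tmp", "alice"));
    }
}
```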
[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934923#comment-13934923 ]

Miodrag Radulovic commented on HDFS-5516:
-----------------------------------------

OK, I will submit the patch for branch-1 later today.

WebHDFS does not require user name when anonymous http requests are disallowed.
-------------------------------------------------------------------------------

Key: HDFS-5516
URL: https://issues.apache.org/jira/browse/HDFS-5516
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 3.0.0, 1.2.1, 2.2.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Attachments: HDFS-5516.patch, HDFS-5516.patch, HDFS-5516.patch

WebHDFS requests do not require a user name to be specified in the request URL, even when the core-site configuration sets HTTP authentication to simple and disables anonymous authentication.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guo Ruijing updated HDFS-6087:
------------------------------

Attachment: HDFS Design Proposal_3_14.pdf

Unify HDFS write/append/truncate
--------------------------------

Key: HDFS-6087
URL: https://issues.apache.org/jira/browse/HDFS-6087
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs-client
Reporter: Guo Ruijing
Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf

In the existing implementation, an HDFS file can be appended to, and an HDFS block can be reopened for append. This design introduces complexity, including lease recovery. If we design HDFS blocks as immutable, append and truncate become very simple. The idea is that an HDFS block is immutable once it is committed to the namenode; if a block is not yet committed, it is the HDFS client's responsibility to re-add it with a new block ID.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6104) TestFsLimits#testDefaultMaxComponentLength Fails on branch-2
Mit Desai created HDFS-6104:
----------------------------

Summary: TestFsLimits#testDefaultMaxComponentLength Fails on branch-2
Key: HDFS-6104
URL: https://issues.apache.org/jira/browse/HDFS-6104
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Mit Desai
Assignee: Mit Desai

testDefaultMaxComponentLength fails intermittently with the following error:

{noformat}
java.lang.AssertionError: expected:0 but was:255
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.hadoop.hdfs.server.namenode.TestFsLimits.testDefaultMaxComponentLength(TestFsLimits.java:90)
{noformat}

On doing some research, I found that this is actually a JDK7 issue (test methods no longer run in a predictable order). The test always fails when it runs after any test that runs the addChildWithName() method.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935017#comment-13935017 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

Updated the document according to Konstantin's comments.

Unify HDFS write/append/truncate
--------------------------------

Key: HDFS-6087
URL: https://issues.apache.org/jira/browse/HDFS-6087
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs-client
Reporter: Guo Ruijing
Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf

In the existing implementation, an HDFS file can be appended to, and an HDFS block can be reopened for append. This design introduces complexity, including lease recovery. If we design HDFS blocks as immutable, append and truncate become very simple. The idea is that an HDFS block is immutable once it is committed to the namenode; if a block is not yet committed, it is the HDFS client's responsibility to re-add it with a new block ID.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935029#comment-13935029 ]

Hudson commented on HDFS-5244:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1701 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1701/])
HDFS-5244. TestNNStorageRetentionManager#testPurgeMultipleDirs fails. Contributed by Jinghui Wang. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577254)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java

TestNNStorageRetentionManager#testPurgeMultipleDirs fails
---------------------------------------------------------

Key: HDFS-5244
URL: https://issues.apache.org/jira/browse/HDFS-5244
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 2.1.0-beta
Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6
Reporter: Jinghui Wang
Assignee: Jinghui Wang
Fix For: 3.0.0, 2.1.0-beta, 2.4.0
Attachments: HDFS-5244.patch

The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap has no predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So when the directories get mocked for the test, they may already be out of the order in which they were added. Thus the order in which the directories are actually purged can differ from the order in which they were added to the LinkedHashSet, causing the test to fail.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935026#comment-13935026 ]

Hudson commented on HDFS-6080:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1701 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1701/])
HDFS-6080. Improve NFS gateway performance by making rtmax and wtmax configurable. Contributed by Abin Shahab (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577319)
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Constant.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm

Improve NFS gateway performance by making rtmax and wtmax configurable
----------------------------------------------------------------------

Key: HDFS-6080
URL: https://issues.apache.org/jira/browse/HDFS-6080
Project: Hadoop HDFS
Issue Type: Improvement
Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
Fix For: 2.4.0
Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch

Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server, and therefore affect read and write performance. We ran performance tests with 1 MB, 100 MB, and 1 GB files, and noticed a significant performance decline with increasing size when compared to fuse. The issue was the hardcoded rtmax size (64 KB); when we increased rtmax to 1 MB, we got a 10x improvement in performance.

NFS reads:
| File          | Size       | Run 1         | Run 2         | Run 3         | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435  | 1.7172094782219731    |
| testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113  | 8.14037175506561      |
| testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667  | 0.030412987573361663  |

Fuse reads:
| File          | Size       | Run 1       | Run 2        | Run 3        | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 2.530063267997  | 0.12457410127142007   |
| testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297  | 0.386672412437576     |
| testFile1Mb   | 1048576    | 0.271615094 | 0.270835986  | 0.271796438  | 0.271415839333  | 0.0004166483951065848 |

NFS reads after rtmax = 1 MB:
| File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
| testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
| testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |
| testFile1Mb   | 1048576    | 0.115602858  | 0.106826253 | 0.125229976  | 0.1158863623334 | 0.007515962395481867 |

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935031#comment-13935031 ]

Hudson commented on HDFS-6084:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1701 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1701/])
HDFS-6084. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage. Contributed by Travis Thompson. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577401)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html

Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
--------------------------------------------------------------

Key: HDFS-6084
URL: https://issues.apache.org/jira/browse/HDFS-6084
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 2.3.0
Reporter: Travis Thompson
Assignee: Travis Thompson
Priority: Minor
Fix For: 2.4.0
Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt

When clicking the Hadoop title, the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think the title should go to the main Namenode page, #tab-overview. Suggestions?

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935032#comment-13935032 ]

Hudson commented on HDFS-6097:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1701 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1701/])
HDFS-6097. Zero-copy reads are incorrectly disabled on file offsets above 2GB (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577350)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java

zero-copy reads are incorrectly disabled on file offsets above 2GB
------------------------------------------------------------------

Key: HDFS-6097
URL: https://issues.apache.org/jira/browse/HDFS-6097
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs-client
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Fix For: 2.4.0
Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch

Zero-copy reads are incorrectly disabled on file offsets above 2GB. The culprit is code that is supposed to disable zero-copy reads only for offsets within a block file greater than 2GB (because MappedByteBuffer segments are limited to that size), but which ends up disabling them based on the absolute file offset as well.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Lower the default maximum items per directory to fix PB fsimage loading
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935034#comment-13935034 ]

Hudson commented on HDFS-6102:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1701 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1701/])
HDFS-6102. Lower the default maximum items per directory to fix PB fsimage loading. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577426)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/fsimage.proto
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java

Lower the default maximum items per directory to fix PB fsimage loading
-----------------------------------------------------------------------

Key: HDFS-6102
URL: https://issues.apache.org/jira/browse/HDFS-6102
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Blocker
Fix For: 2.4.0
Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch

Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and we hit this error when trying to load the resulting very large fsimage:

{noformat}
2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes.
2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742)
com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
	at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
	at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
	at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
	at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
	at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901)
	at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896)
	...
{noformat}

Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miodrag Radulovic updated HDFS-5516: Attachment: HDFS-5516-branch-1.patch Fix for the branch-1. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516-branch-1.patch, HDFS-5516.patch, HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
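The fix amounts to rejecting requests that carry no user identity when anonymous access is disallowed. A minimal sketch of that kind of check, with entirely hypothetical names (the real patch works inside the WebHDFS authentication path, not on raw query strings):

```java
// Hypothetical sketch: does a WebHDFS request query string carry a non-empty
// user.name parameter, and should the request be allowed?
public class UserNameCheck {
    static boolean hasUserName(String query) {
        if (query == null) return false;
        for (String kv : query.split("&")) {
            // Require a non-empty value after "user.name=".
            if (kv.startsWith("user.name=") && kv.length() > "user.name=".length()) {
                return true;
            }
        }
        return false;
    }

    // With simple auth and anonymous access disallowed, a request with no
    // user.name should be rejected.
    static boolean isAllowed(String query, boolean anonymousAllowed) {
        return anonymousAllowed || hasUserName(query);
    }
}
```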
[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935069#comment-13935069 ] Hadoop QA commented on HDFS-5516: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634699/HDFS-5516.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6403//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6403//console This message is automatically generated. WebHDFS does not require user name when anonymous http requests are disallowed. 
--- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516-branch-1.patch, HDFS-5516.patch, HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935074#comment-13935074 ] Hadoop QA commented on HDFS-5516: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634720/HDFS-5516-branch-1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6404//console This message is automatically generated. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516-branch-1.patch, HDFS-5516.patch, HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6094: Status: Patch Available (was: Open) The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
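The general shape of a fix for this kind of double counting is to make the safe-block increment idempotent per block. A simplified sketch, not the actual {{BlockManager}} code:

```java
import java.util.HashSet;
import java.util.Set;

// Simplified illustration: count each block towards the safe mode threshold
// at most once, even if replicas of it are reported more than once.
public class SafeBlockCounter {
    private final Set<Long> counted = new HashSet<>();
    private int safeBlocks = 0;

    // Increment only the first time a block id is seen; a duplicate report
    // of the same block is a no-op.
    void markSafe(long blockId) {
        if (counted.add(blockId)) {
            safeBlocks++;
        }
    }

    int safeBlocks() {
        return safeBlocks;
    }
}
```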
[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935108#comment-13935108 ] Hudson commented on HDFS-5244: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1726 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1726/]) HDFS-5244. TestNNStorageRetentionManager#testPurgeMultipleDirs fails. Contributed by Jinghui Wang. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577254) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, which does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which has a predictable order. So, when the directories get mocked for the test, they could already be out of the order in which they were added. Thus, the order in which the directories were actually purged and the order in which they were added to the LinkedHashSet could differ and cause the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
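The root cause is easy to demonstrate: HashMap makes no iteration-order guarantee, while LinkedHashSet iterates in insertion order. A minimal illustration of the order-preserving structure the test should rely on:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// LinkedHashSet iterates in insertion order, so test setup whose correctness
// depends on ordering should use it (or sort explicitly) rather than relying
// on a HashMap's unordered key view.
public class OrderDemo {
    static List<String> insertionOrder(String... items) {
        return new ArrayList<>(new LinkedHashSet<>(List.of(items)));
    }

    public static void main(String[] args) {
        // Insertion order survives the round-trip through LinkedHashSet.
        System.out.println(insertionOrder("c", "a", "b")); // [c, a, b]
    }
}
```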
[jira] [Commented] (HDFS-6102) Lower the default maximum items per directory to fix PB fsimage loading
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935113#comment-13935113 ] Hudson commented on HDFS-6102: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1726 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1726/]) HDFS-6102. Lower the default maximum items per directory to fix PB fsimage loading. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577426) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/fsimage.proto * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java Lower the default maximum items per directory to fix PB fsimage loading --- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Fix For: 2.4.0 Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 
2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935105#comment-13935105 ] Hudson commented on HDFS-6080: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1726 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1726/]) HDFS-6080. Improve NFS gateway performance by making rtmax and wtmax configurable. Contributed by Abin Shahab (brandonli: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577319) * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Constant.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Fix For: 2.4.0 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size (64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance.
NFS reads:
| File          | Size       | Run 1         | Run 2         | Run 3         | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435  | 1.7172094782219731    |
| testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113  | 8.14037175506561      |
| testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667  | 0.030412987573361663  |
Fuse reads:
| File          | Size       | Run 1        | Run 2        | Run 3        | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 2.394459443  | 2.695265191  | 2.50046517   | 2.530063267997  | 0.12457410127142007   |
| testFile1Gb   | 1073741824 | 25.03324924  | 24.155102554 | 24.901525525 | 24.69662577297  | 0.386672412437576     |
| testFile1Mb   | 1048576    | 0.271615094  | 0.270835986  | 0.271796438  | 0.271415839333  | 0.0004166483951065848 |
NFS reads after rtmax = 1MB:
| File          | Size       | Run 1        | Run 2        | Run 3        | Average         | Std. Dev.             |
| testFile100Mb | 104857600  | 3.655261869  | 3.438676067  | 3.557464787  | 3.550467574336  | 0.0885591069882058    |
| testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122  | 37.997718857 | 36.66074083135  | 1.4389615098060426    |
| testFile1Mb   | 1048576    | 0.115602858  | 0.106826253  | 0.125229976  | 0.1158863623334 | 0.007515962395481867  |
-- This message was sent by Atlassian JIRA (v6.2#6252)
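The change boils down to replacing a hardcoded constant with a configuration lookup that falls back to the old default. A simplified sketch with a hypothetical key name (the actual keys and defaults live in Nfs3Constant and hdfs-default.xml; this is an illustration, not the patch's code):

```java
import java.util.Map;

// Illustrative sketch: read the maximum NFS read transfer size from
// configuration instead of hardcoding 64 KB. The key name "nfs.rtmax" is
// hypothetical.
public class NfsTransferSize {
    static final String RTMAX_KEY = "nfs.rtmax";
    static final int RTMAX_DEFAULT = 64 * 1024; // the old hardcoded value

    static int rtmax(Map<String, String> conf) {
        String v = conf.get(RTMAX_KEY);
        return v == null ? RTMAX_DEFAULT : Integer.parseInt(v);
    }
}
```

With this shape, an admin can raise the value to 1 MB, which is where the reported 10x read improvement came from.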
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935111#comment-13935111 ] Hudson commented on HDFS-6097: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1726 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1726/]) HDFS-6097. Zero-copy reads are incorrectly disabled on file offsets above 2GB (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577350) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.4.0 Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
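The 2GB boundary comes from MappedByteBuffer's int-sized limit. A common pitfall around that boundary is narrowing a long file offset to int, which goes negative above 2GB. A sketch of clamping a zero-copy map length with long arithmetic instead (an illustration of the bug class, not the actual DFSInputStream fix):

```java
// Illustrative sketch: compute the length of a zero-copy mmap window starting
// at `offset` within a block of `blockLen` bytes. Clamp with long math;
// never decide by casting the offset itself down to int.
public class OffsetClamp {
    static final long MAX_MAP = Integer.MAX_VALUE; // MappedByteBuffer limit

    static long mapLength(long offset, long blockLen, long requested) {
        long remaining = blockLen - offset; // stays correct above 2 GB
        return Math.min(Math.min(requested, remaining), MAX_MAP);
    }

    public static void main(String[] args) {
        // A 1 MB read at a 3 GB offset in a 4 GB block is perfectly mappable.
        System.out.println(mapLength(3L << 30, 4L << 30, 1 << 20)); // 1048576
        // The narrowing cast that causes this bug class: 3 GB as int is negative.
        System.out.println((int) (3L << 30) < 0); // true
    }
}
```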
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935110#comment-13935110 ] Hudson commented on HDFS-6084: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1726 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1726/]) HDFS-6084. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage. Contributed by Travis Thompson. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577401) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6105) NN web UI for DN list loads the same jmx page three times.
Kihwal Lee created HDFS-6105: Summary: NN web UI for DN list loads the same jmx page three times. Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading Datanodes page of the NN web UI, the same jmx query is made three times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935144#comment-13935144 ] Chris Nauroth commented on HDFS-6099: - The failure in {{TestBalancer}} is unrelated. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
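Enforcing the limits on rename means applying the same component-length and directory-item checks to the destination path that create already applies. A simplified, hypothetical sketch (names and the 0-means-unlimited convention are assumptions for illustration, not the patch's code):

```java
// Hypothetical sketch: verify a rename destination against fs-limits
// analogous to dfs.namenode.fs-limits.max-component-length and
// dfs.namenode.fs-limits.max-directory-items.
public class FsLimitsCheck {
    static boolean destinationOk(String component, int destDirItems,
                                 int maxComponentLength, int maxDirItems) {
        // Assumed convention: a limit of 0 means the check is disabled.
        boolean lenOk = maxComponentLength == 0
                || component.length() <= maxComponentLength;
        // Adding the renamed inode must not push the destination directory
        // past its item limit.
        boolean itemsOk = destDirItems + 1 <= maxDirItems;
        return lenOk && itemsOk;
    }
}
```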
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6105: - Summary: NN web UI for DN list loads the same jmx page multiple times. (was: NN web UI for DN list loads the same jmx page three times.) NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading Datanodes page of the NN web UI, the same jmx query is made three times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935154#comment-13935154 ] Kihwal Lee commented on HDFS-6105: -- It was pointed out in HDFS-5748 before. If you try reloading the page, it won't load multiple times. But if you click on the Datanodes tab, you will see multiple redundant GETs being issued. If you alternate between Overview and Datanodes, it gets worse. After going back and forth several times, I see it being loaded 9 times when clicking Datanodes. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935155#comment-13935155 ] Kihwal Lee commented on HDFS-6105: -- [~wheat9]: Would you take a look at what's going on? NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6105: - Description: When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. (was: When loading Datanodes page of the NN web UI, the same jmx query is made three times. For a big cluster, that's a lot of data and overhead.) NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935169#comment-13935169 ] Travis Thompson commented on HDFS-6084: --- Thanks everyone Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation
[ https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935194#comment-13935194 ] Thanh Do commented on HDFS-6009: Yu Li, thanks for your detailed comment! Your use case is a great example of isolation. We are currently working on some similar problems but at a lower level of the software stack, so your use case is a great motivation. Tools based on favored node feature for isolation - Key: HDFS-6009 URL: https://issues.apache.org/jira/browse/HDFS-6009 Project: Hadoop HDFS Issue Type: Task Affects Versions: 2.3.0 Reporter: Yu Li Assignee: Yu Li Priority: Minor There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where in multi-tenant deployments of HBase we prefer to specify several groups of regionservers to serve different applications, to achieve some kind of isolation or resource allocation. However, although the regionservers are grouped, the datanodes which store the data are not, which leads to the case that one datanode failure affects multiple applications, as we have already observed in our production environment. To relieve the above issue, we could make use of the favored node feature (HDFS-2576) to make regionservers able to locate data within their group, or say make the datanodes also grouped (passively), to form some level of isolation. In this case, or any other case that needs datanodes to be grouped, we would need a bunch of tools to maintain the groups, including: 1. Making the balancer able to balance data among specified servers, rather than the whole set 2. Setting the balance bandwidth for specified servers, rather than the whole set 3. A tool to check whether a block is placed across groups, and move it back if so This JIRA is an umbrella for the above tools. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6104) TestFsLimits#testDefaultMaxComponentLength Fails on branch-2
[ https://issues.apache.org/jira/browse/HDFS-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA resolved HDFS-6104. - Resolution: Invalid Assignee: (was: Mit Desai) Closing this issue because the test was removed by HDFS-6102. TestFsLimits#testDefaultMaxComponentLength Fails on branch-2 Key: HDFS-6104 URL: https://issues.apache.org/jira/browse/HDFS-6104 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Mit Desai Labels: java7 testDefaultMaxComponentLength fails intermittently with the following error {noformat} java.lang.AssertionError: expected:<0> but was:<255> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.server.namenode.TestFsLimits.testDefaultMaxComponentLength(TestFsLimits.java:90) {noformat} On doing some research, I found that this is actually a JDK7 issue. The test always fails when it runs after any test that runs the addChildWithName() method -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935293#comment-13935293 ] Haohui Mai commented on HDFS-6105: -- Every time you click on the Datanodes tab, it'll reload {{/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo}} to get up-to-date information about the datanodes. This is expected. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading the Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935296#comment-13935296 ] Jing Zhao commented on HDFS-6094: - The patch looks good to me. One question is, currently the NN adds info about a new datanode storage only when processing a complete block report. Can we also do this for IBRs? The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935303#comment-13935303 ] Hadoop QA commented on HDFS-6094: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634642/HDFS-6904.01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6405//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6405//console This message is automatically generated. 
The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935312#comment-13935312 ] Colin Patrick McCabe commented on HDFS-6007: Looks good, I think we're getting close. {code} + Legacy short-circuit local reads implementation + on which the clients directly open the HDFS block files is still available + for the platforms other than Linux. {code} Missing the {code} + Because Legacy short-circuit local reads is insecure, + access to this feature is limited to the users listed in + the value of dfs.block.local-path-access.user. {code} I think this section needs to be moved after the section about dfs.datanode.data.dir.perm. Otherwise it's not clear why the legacy SCR is insecure. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch updating the contents of HDFS Short-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935327#comment-13935327 ] Aaron T. Myers commented on HDFS-5840: -- Sorry, got swamped this week. Will try to get to it early next. Follow-up to HDFS-5138 to improve error handling during partial upgrade failures Key: HDFS-5840 URL: https://issues.apache.org/jira/browse/HDFS-5840 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 3.0.0 Attachments: HDFS-5840.patch Suresh posted some good comments in HDFS-5138 after that patch had already been committed to trunk. This JIRA is to address those. See the first comment of this JIRA for the full content of the review. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5997) TestHASafeMode#testBlocksAddedWhileStandbyIsDown fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935342#comment-13935342 ] Arpit Agarwal commented on HDFS-5997: - Thanks for reporting this [~yuzhih...@gmail.com]. I missed it when filing HDFS-6094. Jing or I will post an updated patch for it soon; if either of you has a consistent repro, it would be great if you could also help verify. TestHASafeMode#testBlocksAddedWhileStandbyIsDown fails in trunk --- Key: HDFS-5997 URL: https://issues.apache.org/jira/browse/HDFS-5997 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1681/ : REGRESSION: org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown Error Message: {code} Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' {code} Stack Trace: {code} java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935411#comment-13935411 ] Jing Zhao commented on HDFS-6100: - The patch looks pretty good to me. Some minor comments: # In DatanodeWebHdfsMethods, the current patch has some inconsistent field names for the NamenodeAddressParam parameter (nnId, namenodeId, and namenodeRpcAddress). How about just calling them namenode since it can be either a NameService ID or a NameNode RPC address? # Nit: the following code needs some reformat: {code} tokenServiceName = HAUtil.isHAEnabled(conf, nsId) ? nsId : NetUtils.getHostPortString (rpcServer.getRpcAddress()); {code} # In the new unit test, we can add some extra checks on the content of the newly created file. Also, maybe we can try to transition the second NN to active first so that the first create call can also hit a failover. # Looks like the patch also fixes the token service name in HA setup for webhdfs. Please update the description of the JIRA. # Could you also post your system test results (HA, non-HA, secured and insecure setups, etc.)? DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. -- This message was sent by Atlassian JIRA (v6.2#6252)
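The fragment Jing flags in point 2 could be reformatted along these lines. This is only a sketch of the suggested layout: the `isHAEnabled` and `getHostPortString` stubs below are hypothetical stand-ins, not the real `HAUtil`/`NetUtils` Hadoop classes.

```java
// Sketch of the reformatted ternary from the review comment. The helper
// methods are simplified stand-ins for HAUtil.isHAEnabled and
// NetUtils.getHostPortString, just to make the layout concrete.
public class TokenServiceNameSketch {
    public static boolean isHAEnabled(String nsId) {
        return nsId != null;
    }

    public static String getHostPortString(String host, int port) {
        return host + ":" + port;
    }

    public static String tokenServiceName(String nsId, String rpcHost, int rpcPort) {
        // Prefer the nameservice ID when HA is enabled; otherwise fall back
        // to the NameNode's RPC host:port string.
        return isHAEnabled(nsId)
            ? nsId
            : getHostPortString(rpcHost, rpcPort);
    }

    public static void main(String[] args) {
        System.out.println(tokenServiceName("ns1", "nn1", 8020)); // ns1
        System.out.println(tokenServiceName(null, "nn1", 8020));  // nn1:8020
    }
}
```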
[jira] [Updated] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-6007: --- Attachment: HDFS-6007-5.patch Attaching the updated patch. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch, HDFS-6007-5.patch updating the contents of HDFS Short-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935440#comment-13935440 ] Kihwal Lee commented on HDFS-6105: -- This is expected. Please read what I said earlier. One click causes multiple loads of the same page. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6090) Use MiniDFSCluster.Builder instead of deprecated constructors
[ https://issues.apache.org/jira/browse/HDFS-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned HDFS-6090: --- Assignee: Akira AJISAKA Use MiniDFSCluster.Builder instead of deprecated constructors - Key: HDFS-6090 URL: https://issues.apache.org/jira/browse/HDFS-6090 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Some test classes are using deprecated constructors such as {{MiniDFSCluster(Configuration, int, boolean, String[], String[])}} for building a MiniDFSCluster. These classes should use {{MiniDFSCluster.Builder}} to reduce javac warnings and improve code readability. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935500#comment-13935500 ] Arpit Agarwal commented on HDFS-6094: - Jing, I think it is a good idea to learn about storages from the IBR. One issue with doing so is that the storage type and state are not known while processing the IBR. We can assume some defaults but this can lead to bugs since the type and state can be used to make replication decisions. I think we need to enhance the incremental report protocol to send the storage type and state along with the storage ID. Then we can safely create a new storage entry. For protocol compatibility we can assume defaults if the type and state are not provided. I am going to code up the patch. Thanks for the ideas! The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.'
at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
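The compatibility defaulting Arpit describes — assume a default storage type and state when an older DataNode omits them from the incremental report — can be sketched as follows. The class and enum names here are hypothetical illustrations, not the actual {{DatanodeProtocol}} types.

```java
// Sketch of protocol-compatibility defaulting for an incremental block
// report: an older DN may not send the storage type/state, so the NN
// substitutes defaults before creating a storage entry. All names are
// hypothetical, not the real Hadoop protocol classes.
public class IbrCompatSketch {
    public enum StorageType { DISK, SSD }
    public enum StorageState { NORMAL, READ_ONLY }

    public static class StorageReport {
        public final String storageId;
        public final StorageType type;   // null if the DN did not send it
        public final StorageState state; // null if the DN did not send it

        public StorageReport(String storageId, StorageType type, StorageState state) {
            this.storageId = storageId;
            this.type = type;
            this.state = state;
        }
    }

    // Apply defaults for fields missing from an old-version report.
    public static StorageReport withDefaults(StorageReport r) {
        StorageType t = (r.type != null) ? r.type : StorageType.DISK;
        StorageState s = (r.state != null) ? r.state : StorageState.NORMAL;
        return new StorageReport(r.storageId, t, s);
    }

    public static void main(String[] args) {
        StorageReport legacy = withDefaults(new StorageReport("DS-1", null, null));
        System.out.println(legacy.type + " " + legacy.state); // DISK NORMAL
    }
}
```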
[jira] [Commented] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935563#comment-13935563 ] Haohui Mai commented on HDFS-6105: -- I can't reproduce the bug. What browser are you using? NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6105) NN web UI for DN list loads the same jmx page multiple times.
[ https://issues.apache.org/jira/browse/HDFS-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Thompson updated HDFS-6105: -- Attachment: datanodes-tab.png I can reproduce it in 2.3.0. If I open the NN page and, with the Firefox console open, click on the Datanodes tab, 3 GETs are sent to http://nn/jmx very quickly. I've attached an image of the Firefox console. Using Firefox 27.0.1 on Mac OS X 10.8.5. NN web UI for DN list loads the same jmx page multiple times. - Key: HDFS-6105 URL: https://issues.apache.org/jira/browse/HDFS-6105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Kihwal Lee Attachments: datanodes-tab.png When loading Datanodes page of the NN web UI, the same jmx query is made multiple times. For a big cluster, that's a lot of data and overhead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6088) Add configurable maximum block count for datanode
[ https://issues.apache.org/jira/browse/HDFS-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935638#comment-13935638 ] Kihwal Lee commented on HDFS-6088: -- bq. Would be nice to avoid having yet another config that users have to set. I agree. I was looking at the heap usage of a DN. It looks like the heap usage has dropped considerably since we moved to use GSet for block map. So much so that the automatically defined GSet capacity doesn't seem to be sufficient. For example, I brought up a DN with about 62K blocks with the max heap set to 1GB. The GSet was created for 524,288 entries. Looking at the heap usage, each block takes up about 315 bytes. Other parts take up less than 50MB. In any case, 315 * 524288 = 157MB. Even if other parts take up more than expected, the node can easily store 4X of this. But storing 2M entries in the small GSet is not ideal. Add configurable maximum block count for datanode - Key: HDFS-6088 URL: https://issues.apache.org/jira/browse/HDFS-6088 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Currently datanode resources are protected by the free space check and the balancer. But datanodes can run out of memory simply storing too many blocks. If the sizes of blocks are small, datanodes will appear to have plenty of space to put more blocks. I propose adding a configurable max block count to datanode. Since datanodes can have different heap configurations, it will make sense to make it datanode-level, rather than something enforced by namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
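Kihwal's back-of-the-envelope heap arithmetic can be reproduced directly. The 315 bytes/block and the 524,288-entry GSet capacity are the measurements quoted in his comment, not constants from the code:

```java
// Reproduce the heap math from the comment above: ~315 bytes observed per
// block, and a GSet auto-sized to 524,288 entries on a 1 GB max heap.
public class GSetHeapMath {
    public static long blockMapBytes(long bytesPerBlock, long entries) {
        return bytesPerBlock * entries;
    }

    public static void main(String[] args) {
        long bytes = blockMapBytes(315, 524_288);
        // 315 * 524288 = 165,150,720 bytes, i.e. ~157 MB
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```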
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935686#comment-13935686 ] Konstantin Shvachko commented on HDFS-6087: --- Based on what you write, I see two main problems with your approach. # A block cannot be read by others while under construction, until it is fully written and committed. That would be a step back. Making UC-blocks readable was one of the append design requirements (see HDFS-265 and preceding work). If a slow client writes to a block 1KB/min, others will have to wait for hours until they can see the progress on the file. # Your proposal (if I understand it correctly) will potentially lead to a lot of small blocks if appends, fsyncs (and truncates) are used intensively. Say, in order to overcome problem (1) I write my application so that it closes the file after each 1KB written and reopens for append one minute later. You get lots of 1KB blocks. And small blocks are bad for the NameNode as we know. Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append/truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is the HDFS client’s responsibility to re-add it with a new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935668#comment-13935668 ] Hadoop QA commented on HDFS-6007: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634789/HDFS-6007-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6406//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6406//console This message is automatically generated. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch, HDFS-6007-5.patch updating the contents of HDFS Short-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6094: Attachment: HDFS-6094.03.patch Updated patch with Jing's suggestion. To do this right required some additions to the {{DatanodeProtocol}} and some corresponding changes within the DataNode. Protocol changes are wire compatible. Jenkins will flag some new warnings for using deprecated APIs, which is expected. The usages in the protobuf translators are required for wire compatibility, and the remaining usages are in a couple of tests and in {{NNThroughputBenchmark}}, which we can update later. The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6094.03.patch, HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935707#comment-13935707 ] Tsz Wo Nicholas Sze commented on HDFS-6087: --- 1. A block cannot be read by others while under construction, until it is fully written and committed. ... It also does not support hflush. 2. Your proposal (if I understand it correctly) will potentially lead to a lot of small blocks if appends, fsyncs (and truncates) are used intensively. ... I guess it won't lead to a lot of small blocks since it does copy-on-write. However, there is going to be a lot of block copying if there are a lot of appends, hsyncs, etc. In addition, I think it would be a problem for reading the last block: If a reader opens a file and reads the last block slowly, then a writer reopens the file for append and commits a new last block. The old last block may then be deleted and become unavailable to the reader. Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append/truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is the HDFS client’s responsibility to re-add it with a new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935723#comment-13935723 ] Colin Patrick McCabe commented on HDFS-6093: This looks good overall. I think rather than protect {{CacheReplicationMonitor#numPendingCaching}} with the FSN lock, it would be better to make it an Atomic64 that we swap in at the end of the rescan. That way we're not baking in the assumption that the rescan thread holds the FSN lock for the whole duration of the rescan. It would also minimize the time we spend blocking waiting for the FSN lock in the MBean stuff. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
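The pattern Colin suggests — accumulate the count locally during the rescan, then publish it with a single atomic swap so readers never need the FSN lock — might look like this sketch. The class and method names are hypothetical, not the actual CacheReplicationMonitor code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the rescan thread counts pending caching work into a local
// variable and publishes the total with one atomic set at the end, so the
// MBean read path never touches the FSN lock. Hypothetical names.
public class PendingCachingSketch {
    private final AtomicLong numPendingCaching = new AtomicLong(0);

    public void rescan(long[] pendingPerDirective) {
        long pending = 0;
        for (long p : pendingPerDirective) {
            pending += p;               // local accumulation, no shared state
        }
        numPendingCaching.set(pending); // single atomic publish at the end
    }

    // Lock-free read path for the MBean.
    public long getNumPendingCaching() {
        return numPendingCaching.get();
    }

    public static void main(String[] args) {
        PendingCachingSketch s = new PendingCachingSketch();
        s.rescan(new long[] {1, 2, 3});
        System.out.println(s.getNumPendingCaching()); // 6
    }
}
```

Readers see either the previous rescan's total or the new one, never a half-updated count, which is the compromise Colin describes: slightly stale values in exchange for no blocking on the FSN lock.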
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935727#comment-13935727 ] Konstantin Shvachko commented on HDFS-6087: --- If it does copy-on-write, then the block is not immutable, at least in the sense I understand the term. Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append/truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is the HDFS client’s responsibility to re-add it with a new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.
[ https://issues.apache.org/jira/browse/HDFS-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935747#comment-13935747 ] jun aoki commented on HDFS-6103: Hi [~vinayrpet], I got the error message when I executed {code} sudo service hadoop-hdfs-namenode start {code} Then I found that I'd have to execute {code} sudo service hadoop-hdfs-namenode upgrade #(1) {code} Note that this does not have a hyphen, e.g. -upgrade. I have also found that users can execute hadoop-daemon.sh. I've never tried it this way, but something like {code} hadoop-daemon.sh --config /etc/hadoop start namenode -upgrade # (2) {code} Then this will require a hyphen. I thought (1) was the preferred way, hence this ticket, but if I'm wrong and (2) is equally or more preferred, please let me know. FSImage file system image version check throw a (slightly) wrong parameter. --- Key: HDFS-6103 URL: https://issues.apache.org/jira/browse/HDFS-6103 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: jun aoki Priority: Trivial Trivial error message issue: When upgrading HDFS, say from 2.0.5 to 2.2.0, users will need to start the namenode with the upgrade option, e.g. {code} sudo service namenode upgrade {code} That said, the actual error shown without the option says -upgrade (with a hyphen): {code} 2014-03-13 23:38:15,488 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join java.io.IOException: File system image contains an old layout version -40. An upgrade to version -47 is required. Please restart NameNode with -upgrade option.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:787) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:568) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:684) at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:669) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320) 2014-03-13 23:38:15,492 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1 2014-03-13 23:38:15,493 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.2.202 / ~ {code} I'm referring to 2.0.5 above, https://github.com/apache/hadoop-common/blob/branch-2.0.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L225 I haven't tried the trunk but it seems to return UPGRADE (all upper case), which is again a slightly wrong error description. https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L232 -- This message was sent by Atlassian JIRA (v6.2#6252)
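One way to keep the error message in sync with the actual flag spelling is to derive the hint from the option definition itself. The following is only a sketch with a hypothetical StartupOption enum, not the actual org.apache.hadoop.hdfs FSImage/StartupOption code:

```java
// Sketch: build the "please restart with ..." hint from the option
// definition so the message cannot drift from the real flag spelling
// (the bug here: the log printed "-upgrade"/"UPGRADE" while the service
// script expects plain "upgrade"). Hypothetical enum, not Hadoop's.
public class UpgradeMessageSketch {
    public enum StartupOption {
        UPGRADE("-upgrade");

        private final String flag;
        StartupOption(String flag) { this.flag = flag; }
        public String getFlag() { return flag; }
    }

    public static String upgradeRequiredMessage(int oldLv, int newLv) {
        return "File system image contains an old layout version " + oldLv
            + ". An upgrade to version " + newLv
            + " is required. Please restart NameNode with the "
            + StartupOption.UPGRADE.getFlag() + " option.";
    }

    public static void main(String[] args) {
        System.out.println(upgradeRequiredMessage(-40, -47));
    }
}
```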
[jira] [Comment Edited] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935758#comment-13935758 ] Arpit Agarwal edited comment on HDFS-6093 at 3/14/14 10:45 PM: --- Hi Andrew, I just tried out your patch and I think there is some mismatch between the output of {{dfsAdmin -report}} and {{cacheadmin -listPools}}. This is with a single NN/single DN pseudocluster on CentOS 6.5. I ran the following commands: - bin/hdfs cacheadmin -addPool pool1 -limit 1073741824 - bin/hdfs cacheadmin -addDirective -path /f1 -pool pool1 This says FILES_CACHED is zero. {code} $ bin/hdfs cacheadmin -listPools -stats Found 1 result. NAME OWNER GROUP MODE LIMIT MAXTTL BYTES_NEEDED BYTES_CACHED BYTES_OVERLIMIT FILES_NEEDED FILES_CACHED pool1 aagarwal aagarwal rwxr-xr-x 1073741824 never 1048576 0 0 1 0 {code} However, this says cache used is 1MB. {code} $ bin/hdfs dfsadmin -report Configured Capacity: 49202208768 (45.82 GB) Present Capacity: 39676268544 (36.95 GB) DFS Remaining: 39675179008 (36.95 GB) DFS Used: 1089536 (1.04 MB) DFS Used%: 0.00% Configured Cache Capacity: 268435456 (256 MB) Present Cache Capacity: 268435456 (256 MB) Cache Remaining: 267386880 (255 MB) Cache Used: 1048576 (1 MB) Cache Used%: 0.39% {code} I did not see any error messages related to caching in the DN/NN logs. was (Author: arpitagarwal): Hi Andrew, I just tried out your patch and I think there is some mismatch between the output of {{dfsAdmin -report}} and {{cacheadmin -listPools}}. This is with a single NN/single DN pseudocluster on CentOS 6.5. I ran the following commands: - bin/hdfs cacheadmin -addPool pool1 -limit 1073741824 - bin/hdfs cacheadmin -addDirective -path /f1 -pool pool1 This says FILES_CACHED is zero. {code} $ bin/hdfs cacheadmin -listPools -stats Found 1 result.
NAME OWNER GROUP MODE LIMIT MAXTTL BYTES_NEEDED BYTES_CACHED BYTES_OVERLIMIT FILES_NEEDED FILES_CACHED pool1 aagarwal aagarwal rwxr-xr-x 1073741824 never 1048576 0 0 1 0 {code} However, this says cache used is 1MB. {code} aagarwal@arrow ~/deploy2/hadoop-3.0.0-SNAPSHOT$ bin/hdfs dfsadmin -report Configured Capacity: 49202208768 (45.82 GB) Present Capacity: 39676268544 (36.95 GB) DFS Remaining: 39675179008 (36.95 GB) DFS Used: 1089536 (1.04 MB) DFS Used%: 0.00% Configured Cache Capacity: 268435456 (256 MB) Present Cache Capacity: 268435456 (256 MB) Cache Remaining: 267386880 (255 MB) Cache Used: 1048576 (1 MB) Cache Used%: 0.39% {code} I did not see any error messages related to caching in the DN/NN logs. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work. It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935758#comment-13935758 ] Arpit Agarwal commented on HDFS-6093: - Hi Andrew, I just tried out your patch and I think there is some mismatch between the output of {{dfsAdmin -report}} and {{cacheadmin -listPools}}. This is with a single NN/single DN pseudocluster on CentOS 6.5. I ran the following commands: - bin/hdfs cacheadmin -addPool pool1 -limit 1073741824 - bin/hdfs cacheadmin -addDirective -path /f1 -pool pool1 This says FILES_CACHED is zero. {code} $ bin/hdfs cacheadmin -listPools -stats Found 1 result. NAME OWNER GROUP MODE LIMIT MAXTTL BYTES_NEEDED BYTES_CACHED BYTES_OVERLIMIT FILES_NEEDED FILES_CACHED pool1 aagarwal aagarwal rwxr-xr-x 1073741824 never 1048576 0 0 1 0 {code} However, this says cache used is 1MB. {code} aagarwal@arrow ~/deploy2/hadoop-3.0.0-SNAPSHOT$ bin/hdfs dfsadmin -report Configured Capacity: 49202208768 (45.82 GB) Present Capacity: 39676268544 (36.95 GB) DFS Remaining: 39675179008 (36.95 GB) DFS Used: 1089536 (1.04 MB) DFS Used%: 0.00% Configured Cache Capacity: 268435456 (256 MB) Present Cache Capacity: 268435456 (256 MB) Cache Remaining: 267386880 (255 MB) Cache Used: 1048576 (1 MB) Cache Used%: 0.39% {code} I did not see any error messages related to caching in the DN/NN logs. Expose more caching information for debugging by users -- Key: HDFS-6093 URL: https://issues.apache.org/jira/browse/HDFS-6093 Project: Hadoop HDFS Issue Type: Improvement Components: caching Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-6093-1.patch When users submit a new cache directive, it's unclear if the NN has recognized it and is actively trying to cache it, or if it's hung for some other reason. It'd be nice to expose a pending caching/uncaching count the same way we expose pending replication work.
It'd also be nice to display the aggregate cache capacity and usage in dfsadmin -report, since we already have it as a metric and expose it per-DN in report output. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935771#comment-13935771 ] Colin Patrick McCabe commented on HDFS-6093: Hi Arpit, It takes time for the values reported by dfsadmin -report and cacheadmin -listPools to converge, since dfsadmin reports information taken from the DN heartbeat, while listPools reports information taken from the CacheReplicationMonitor. Try waiting 5 or 10 minutes. We might want to shorten the default for {{dfs.namenode.path.based.cache.retry.interval.ms}} for this reason.
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935770#comment-13935770 ] Andrew Wang commented on HDFS-6093: --- Hey Arpit, So the confusing thing about these stats is that the pool and directive stats are only updated when the CacheReplicationMonitor runs (default every 5 minutes). The datanode-level stats are updated on the heartbeat, so they refresh much more frequently. I think if you wait for a CRM run, it'll then show up in listPools. I was considering lowering the default CRM interval for this reason, maybe to 1 min or 30s.
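[Editor's note] The CRM-vs-heartbeat lag discussed in the comments above can be pictured with a toy simulation. This is illustrative only, not Hadoop code: the class, method, and tick periods below are hypothetical stand-ins for two caches of the same value refreshed at different intervals.

```java
// Toy model: two cached views of one underlying value, one refreshed every
// "heartbeat" period (fast, like the DN stats) and one refreshed every
// "rescan" period (slow, like the CacheReplicationMonitor stats). The views
// only agree once the slower refresh has fired.
public class StaleViews {
    // Returns the first tick at which both views reflect a value set at tick 0.
    static int firstAgreementTick(int heartbeatPeriod, int rescanPeriod) {
        long actual = 1_048_576L;          // e.g. bytes cached, set at tick 0
        long heartbeatView = 0, rescanView = 0;
        for (int tick = 1; tick <= heartbeatPeriod * rescanPeriod; tick++) {
            if (tick % heartbeatPeriod == 0) heartbeatView = actual;
            if (tick % rescanPeriod == 0)    rescanView = actual;
            if (heartbeatView == actual && rescanView == actual) return tick;
        }
        return -1; // never converged within the simulated window
    }

    public static void main(String[] args) {
        // Heartbeat every 3 ticks, rescan every 300: no agreement before tick 300.
        System.out.println(StaleViews.firstAgreementTick(3, 300));
    }
}
```

The point of the sketch: neither view is wrong, they are just sampled snapshots with different staleness bounds, which is why waiting one rescan interval makes the mismatch disappear.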
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935777#comment-13935777 ] Colin Patrick McCabe commented on HDFS-6093: bq. I was considering lowering the default CRM interval for this reason, maybe to 1 min or 30s, for this reason. Yeah, maybe we should set it to 30 seconds for now to get a better user experience. We can always raise it if a performance issue emerges on a big cluster.
[jira] [Commented] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.
[ https://issues.apache.org/jira/browse/HDFS-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935784#comment-13935784 ] Akira AJISAKA commented on HDFS-6103: - Hi [~jaoki], what distribution of Hadoop are you using? AFAIK, service scripts are not provided in Apache Hadoop itself, so (2) is preferred if you are using the community version. FSImage file system image version check throw a (slightly) wrong parameter. --- Key: HDFS-6103 URL: https://issues.apache.org/jira/browse/HDFS-6103 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: jun aoki Priority: Trivial Trivial error message issue: When upgrading HDFS, say from 2.0.5 to 2.2.0, users need to start the namenode with the upgrade option, e.g.
{code}
sudo service namenode upgrade
{code}
However, the actual error printed when starting without the option says -upgrade (with a hyphen):
{code}
2014-03-13 23:38:15,488 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join java.io.IOException: File system image contains an old layout version -40. An upgrade to version -47 is required. Please restart NameNode with -upgrade option.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:787)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:568)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:684)
at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:669)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
2014-03-13 23:38:15,492 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-13 23:38:15,493 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.2.202
{code}
I'm referring to 2.0.5 above: https://github.com/apache/hadoop-common/blob/branch-2.0.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L225 I haven't tried trunk, but it seems to return UPGRADE (all upper case), which is again a slightly wrong error description: https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L232
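[Editor's note] One way to avoid the flag text drifting between "upgrade", "-upgrade", and "UPGRADE" is to build the message from the option constant itself. The sketch below is a simplified, self-contained illustration of that direction; the mini enum is hypothetical and stands in for Hadoop's real StartupOption class.

```java
// Simplified sketch: deriving the flag text in the error message from the
// startup-option constant itself, so the hyphen and casing can never drift
// from what the CLI actually accepts. Illustrative only, not the Hadoop code.
public class UpgradeMessage {
    enum Option {
        UPGRADE("-upgrade"), ROLLBACK("-rollback");
        private final String flag;
        Option(String flag) { this.flag = flag; }
        String getName() { return flag; }
    }

    static String upgradeRequired(int foundLayout, int requiredLayout) {
        return "File system image contains an old layout version " + foundLayout
            + ". An upgrade to version " + requiredLayout + " is required."
            + " Please restart NameNode with " + Option.UPGRADE.getName()
            + " option.";
    }

    public static void main(String[] args) {
        System.out.println(upgradeRequired(-40, -47));
    }
}
```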
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935791#comment-13935791 ] Colin Patrick McCabe commented on HDFS-6093: sorry, meant to write dfs.namenode.path.based.cache.refresh.interval.ms
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935797#comment-13935797 ] Colin Patrick McCabe commented on HDFS-6093: I filed HDFS-6106 to reduce the defaults a bit.
[jira] [Updated] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6106: --- Attachment: HDFS-6106.001.patch Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms Key: HDFS-6106 URL: https://issues.apache.org/jira/browse/HDFS-6106 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6106.001.patch Reduce the default for {{dfs.namenode.path.based.cache.refresh.interval.ms}} to improve the responsiveness of caching.
[jira] [Assigned] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe reassigned HDFS-6106: -- Assignee: Colin Patrick McCabe
[jira] [Updated] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6106: --- Status: Patch Available (was: Open)
[jira] [Created] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
Colin Patrick McCabe created HDFS-6106: -- Summary: Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms Key: HDFS-6106 URL: https://issues.apache.org/jira/browse/HDFS-6106 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Attachments: HDFS-6106.001.patch
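[Editor's note] Independent of whatever default HDFS-6106 settles on, an operator can override the refresh interval per cluster. A hypothetical hdfs-site.xml fragment using the 30-second value floated in the discussion above (30000 ms is a suggested value from the thread, not a committed default):

```xml
<!-- Hypothetical hdfs-site.xml override: refresh cache directive stats every
     30 seconds instead of the (then-current) 5-minute default. -->
<property>
  <name>dfs.namenode.path.based.cache.refresh.interval.ms</name>
  <value>30000</value>
</property>
```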
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935816#comment-13935816 ] Andrew Wang commented on HDFS-6106: --- +1 pending
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935859#comment-13935859 ] Hadoop QA commented on HDFS-6094: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634845/HDFS-6094.03.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1540 javac compiler warnings (more than the trunk's current 1531 warnings).
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6407//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6407//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6407//console
This message is automatically generated.
The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6094.03.patch, HDFS-6904.01.patch, TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details:
{code}
Time elapsed: 12.874 sec FAILURE!
java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.'
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
{code}
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935858#comment-13935858 ] Arpit Agarwal commented on HDFS-6093: - Thanks Andrew/Colin, the values did converge! wrt the patch:
# In addition to reducing the timeout as you suggested, can we add some explanation to the command output, or update CentralizedCacheManagement.html in the docs? Additionally, does it make sense to display the pending caching/uncaching counts in the output of 'dfsadmin -report'? This would make it clear right away that there are some pending cache operations.
# {{CacheReplicationMonitor#rescan}} resets the counters to zero outside the write lock. The reset should be moved inside the lock, else readers might see blips with the counters intermittently going to zero.
# Was {{stillPendingUncached}} introduced to fix a bug?
Minor code style comment: {{getPendingCachingCount}} can be condensed to
{code}
return (monitor != null ? monitor.getPendingCachingCount() : 0);
{code}
Same with {{getPendingUncachingCount}}. Change looks good otherwise.
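[Editor's note] Review point 2 above (resetting counters outside the write lock) is a classic read-write-lock pitfall. A minimal self-contained sketch of the fix; field and method names are illustrative stand-ins, not the actual CacheReplicationMonitor members:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Reset and repopulate the pending counters inside the same write-lock
// critical section, so a reader can never observe the transient zero
// between the reset and the rescan recomputing the values.
public class PendingCounters {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long pendingCaching;
    private long pendingUncaching;

    void rescan(long recountedCaching, long recountedUncaching) {
        lock.writeLock().lock();
        try {
            pendingCaching = 0;                  // reset...
            pendingUncaching = 0;
            pendingCaching = recountedCaching;   // ...and repopulate under the
            pendingUncaching = recountedUncaching; // same lock acquisition
        } finally {
            lock.writeLock().unlock();
        }
    }

    long getPendingCachingCount() {
        lock.readLock().lock();
        try {
            return pendingCaching;
        } finally {
            lock.readLock().unlock();
        }
    }

    long getPendingUncachingCount() {
        lock.readLock().lock();
        try {
            return pendingUncaching;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

If the reset happened before the write lock was taken, a concurrent reader holding the read lock could return 0 even though cache work was still pending, which is exactly the "blip" the review describes.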
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935862#comment-13935862 ] Arpit Agarwal commented on HDFS-6094: - The warnings are expected due to new deprecations. We can fix the test warnings later.
[jira] [Commented] (HDFS-6103) FSImage file system image version check throw a (slightly) wrong parameter.
[ https://issues.apache.org/jira/browse/HDFS-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935892#comment-13935892 ] jun aoki commented on HDFS-6103: Hi [~ajisakaa], thank you for clarifying. I'm using Bigtop. Let's focus on StartupOption.UPGRADE in this ticket.
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935931#comment-13935931 ] Hadoop QA commented on HDFS-6106: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634861/HDFS-6106.001.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6408//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6408//console
This message is automatically generated.