[jira] [Updated] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6085:
---

   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.
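Purely as an illustration of the kind of message this asks for (not the attached patch), here is a minimal sketch that guards per-block logging behind the finest log level and prefixes each message with the directive ID; java.util.logging's FINEST stands in for TRACE, and all class, method, and field names are hypothetical:

{code:java}
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: guard expensive per-block logging behind the finest level and
// prefix each message with the directive ID so messages are easy to correlate.
public class DirectiveLoggingSketch {
  private static final Logger LOG = Logger.getLogger("CacheReplicationMonitorSketch");

  static void logScanned(long directiveId, long blockId, short cachedReplicas) {
    if (LOG.isLoggable(Level.FINEST)) {   // FINEST stands in for TRACE here
      LOG.finest("Directive " + directiveId + ": block " + blockId
          + " has " + cachedReplicas + " cached replica(s)");
    }
  }

  public static void main(String[] args) {
    logScanned(42L, 1073741825L, (short) 2);
  }
}
{code}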



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930011#comment-13930011
 ] 

Colin Patrick McCabe commented on HDFS-6085:


bq. This looks good. Do you want to try lowering the log level for 
CacheManager#cacheReports in this patch as well? Right now it's pretty spammy at 
INFO, and I imagine it being even worse on a large cluster.

I'll roll that into HDFS-6086.

bq. Otherwise, +1 pending.

Thanks, committed.

 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6086:
---

Attachment: HDFS-6086.002.patch

* Fix compilation failure due to updating FSDatasetSpi interface
* change log level of cache report acknowledgement log message in CacheManager

 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.
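To make the intended check concrete, here is a small sketch of consulting a registry of cached blocks while serving the FD request, instead of waiting for a callback; the class and method names below are hypothetical stand-ins, not the actual DataXceiver or ShortCircuitRegistry API:

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: hypothetical registry consulted when a short-circuit FD request
// arrives, so a block cached *before* the request still permits a
// no-checksum / zero-copy read.
public class ShortCircuitSketch {
  // Stand-in for the datanode's record of blocks pinned in the cache.
  private final Set<Long> cachedBlockIds = ConcurrentHashMap.newKeySet();

  void blockCached(long blockId)   { cachedBlockIds.add(blockId); }
  void blockUncached(long blockId) { cachedBlockIds.remove(blockId); }

  // Called while serving the FD request, rather than waiting for a callback.
  boolean allowNoChecksumRead(long blockId) {
    return cachedBlockIds.contains(blockId);
  }

  public static void main(String[] args) {
    ShortCircuitSketch registry = new ShortCircuitSketch();
    registry.blockCached(1001L);                              // cached before the request
    System.out.println(registry.allowNoChecksumRead(1001L));  // true
    System.out.println(registry.allowNoChecksumRead(1002L));  // false
  }
}
{code}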



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930018#comment-13930018
 ] 

Hudson commented on HDFS-6085:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5304 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5304/])
HDFS-6085. Improve CacheReplicationMonitor log messages a bit (cmccabe) 
(cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576194)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java


 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930040#comment-13930040
 ] 

Hadoop QA commented on HDFS-6080:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633847/HDFS-6080.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs-nfs:

  org.apache.hadoop.fs.TestHdfsNativeCodeLoader

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6369//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6369//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6369//console

This message is automatically generated.

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1 MB, 100 MB, and 1 GB files. We noticed a 
 significant performance decline as the file size increased, compared to FUSE. 
 We realized that the issue was the hardcoded rtmax size (64 KB). 
 When we increased rtmax to 1 MB, we got a 10x improvement in performance.
 NFS reads:
 | File          | Size       | Run 1         | Run 2         | Run 3         | Average        | Std. Dev.            |
 | testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561     |
 | testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667 | 0.030412987573361663 |
 Fuse reads:
 | File          | Size       | Run 1        | Run 2        | Run 3        | Average        | Std. Dev.             |
 | testFile100Mb | 104857600  | 2.394459443  | 2.695265191  | 2.50046517   | 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824 | 25.03324924  | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576     |
 | testFile1Mb   | 1048576    | 0.271615094  | 0.270835986  | 0.271796438  | 0.271415839333 | 0.0004166483951065848 |
 (NFS read after rtmax = 1MB)
 | File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.          |
 | testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058 |
 | testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426 |
 | 

[jira] [Updated] (HDFS-5196) Provide more snapshot information in WebUI

2014-03-11 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HDFS-5196:
-

Attachment: HDFS-5196-2.patch

I attached a patch file that uses only the new web UI.

 Provide more snapshot information in WebUI
 --

 Key: HDFS-5196
 URL: https://issues.apache.org/jira/browse/HDFS-5196
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
 Attachments: HDFS-5196-2.patch, HDFS-5196.patch, HDFS-5196.patch, 
 HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, 
 snapshotteddir.png


 The WebUI should provide more detailed information about snapshots, such as 
 all snapshottable directories and corresponding number of snapshots 
 (suggested in HDFS-4096).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930106#comment-13930106
 ] 

Hadoop QA commented on HDFS-6086:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633862/HDFS-6086.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6370//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6370//console

This message is automatically generated.

 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-11 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930186#comment-13930186
 ] 

Vinayakumar B commented on HDFS-5638:
-

Thanks Chris for review and splitting. 

 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.
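For context, a minimal sketch of how the FileContext ACL API added here might be used, assuming the method and builder names from this patch (modifyAclEntries, getAclStatus, AclEntry.Builder), a made-up path, and a client configuration already on the classpath:

{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

// Sketch: add a named-user ACL entry through FileContext and read the ACL back.
// The path is hypothetical and ACLs must be enabled on the target cluster.
public class FileContextAclSketch {
  public static void main(String[] args) throws Exception {
    FileContext fc = FileContext.getFileContext();
    Path path = new Path("/tmp/acl-demo");

    List<AclEntry> aclSpec = Arrays.asList(
        new AclEntry.Builder().setScope(AclEntryScope.ACCESS)
            .setType(AclEntryType.USER).setName("alice")
            .setPermission(FsAction.READ_EXECUTE).build());

    fc.modifyAclEntries(path, aclSpec);         // add/replace the named entry
    System.out.println(fc.getAclStatus(path));  // inspect the resulting ACL
  }
}
{code}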



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5196) Provide more snapshot information in WebUI

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930194#comment-13930194
 ] 

Hadoop QA commented on HDFS-5196:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633870/HDFS-5196-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6371//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6371//console

This message is automatically generated.

 Provide more snapshot information in WebUI
 --

 Key: HDFS-5196
 URL: https://issues.apache.org/jira/browse/HDFS-5196
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
 Attachments: HDFS-5196-2.patch, HDFS-5196.patch, HDFS-5196.patch, 
 HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, 
 snapshotteddir.png


 The WebUI should provide more detailed information about snapshots, such as 
 all snapshottable directories and corresponding number of snapshots 
 (suggested in HDFS-4096).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930225#comment-13930225
 ] 

Hudson commented on HDFS-5535:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
Move HDFS-5535 to Release 2.4.0 in CHANGES.txt. (szetszwo: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576148)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Umbrella jira for improved HDFS rolling upgrades
 

 Key: HDFS-5535
 URL: https://issues.apache.org/jira/browse/HDFS-5535
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, ha, hdfs-client, namenode
Affects Versions: 3.0.0, 2.2.0
Reporter: Nathan Roberts
Assignee: Tsz Wo Nicholas Sze
 Fix For: 2.4.0

 Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v2.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v3.pdf, h5535_20140219.patch, 
 h5535_20140220-1554.patch, h5535_20140220b.patch, h5535_20140221-2031.patch, 
 h5535_20140224-1931.patch, h5535_20140225-1225.patch, 
 h5535_20140226-1328.patch, h5535_20140226-1911.patch, 
 h5535_20140227-1239.patch, h5535_20140228-1714.patch, 
 h5535_20140304-1138.patch, h5535_20140304-branch-2.patch, 
 h5535_20140310-branch-2.patch, hdfs-5535-test-plan.pdf


 In order to roll a new HDFS release through a large cluster quickly and 
 safely, a few enhancements are needed in HDFS. An initial High level design 
 document will be attached to this jira, and sub-jiras will itemize the 
 individual tasks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930227#comment-13930227
 ] 

Hudson commented on HDFS-3405:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
Move HDFS-3405 to 2.4.0 section in CHANGES.txt (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576158)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged 
 fsimages
 

 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
Reporter: Aaron T. Myers
Assignee: Vinayakumar B
 Fix For: 2.4.0

 Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch


 As Todd points out in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
 the current scheme for a checkpointing daemon to upload a merged fsimage file 
 to an NN is to issue an HTTP GET request that tells the target NN to issue 
 another GET request back to the checkpointing daemon to retrieve the merged 
 fsimage file. There's no fundamental reason the checkpointing daemon can't 
 just use an HTTP POST or PUT to send the merged fsimage file directly, rather 
 than the double-GET scheme.
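For illustration, a minimal sketch of the direct-upload idea using a plain HTTP PUT from the checkpointing daemon; the endpoint URL, port, and query parameter below are made up and are not the actual image-transfer servlet contract:

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: PUT the merged fsimage straight to the NN instead of asking the
// NN to GET it back. The URL and parameter names below are hypothetical.
public class FsImagePutSketch {
  public static void main(String[] args) throws Exception {
    Path fsimage = Paths.get("/tmp/fsimage_0000000000000042");         // made-up local file
    URL url = new URL("http://namenode:50070/imagetransfer?txid=42");  // made-up endpoint

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    conn.setFixedLengthStreamingMode(Files.size(fsimage));  // avoid buffering the whole image

    try (OutputStream out = conn.getOutputStream()) {
      Files.copy(fsimage, out);
    }
    System.out.println("NN responded: " + conn.getResponseCode());
  }
}
{code}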



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6070) Cleanup use of ReadStatistics in DFSInputStream

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930231#comment-13930231
 ] 

Hudson commented on HDFS-6070:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
HDFS-6070. Cleanup use of ReadStatistics in DFSInputStream. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576047)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java


 Cleanup use of ReadStatistics in DFSInputStream
 ---

 Key: HDFS-6070
 URL: https://issues.apache.org/jira/browse/HDFS-6070
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Fix For: 2.4.0

 Attachments: hdfs-6070.patch


 Trivial little code cleanup related to DFSInputStream#ReadStatistics to use 
 update methods rather than reaching in directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5892) TestDeleteBlockPool fails in branch-2

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930237#comment-13930237
 ] 

Hudson commented on HDFS-5892:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
HDFS-5892. TestDeleteBlockPool fails in branch-2. Contributed by Ted Yu. 
(wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576035)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSNNTopology.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java


 TestDeleteBlockPool fails in branch-2
 -

 Key: HDFS-5892
 URL: https://issues.apache.org/jira/browse/HDFS-5892
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5892.patch, 
 org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool-output.txt


 Running test suite on Linux, I got:
 {code}
 testDeleteBlockPool(org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool)
   Time elapsed: 8.143 sec   ERROR!
 java.io.IOException: All datanodes 127.0.0.1:43721 are bad. Aborting...
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6077) running slive with webhdfs on secure HA cluster fails with unkown host exception

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930235#comment-13930235
 ] 

Hudson commented on HDFS-6077:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
HDFS-6077. running slive with webhdfs on secure HA cluster fails with unkown 
host exception. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576076)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


 running slive with webhdfs on secure HA cluster fails with unkown host 
 exception
 

 Key: HDFS-6077
 URL: https://issues.apache.org/jira/browse/HDFS-6077
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Fix For: 2.4.0

 Attachments: HDFS-6077.000.patch


 SliveTest fails with following.  See the comment for more logs.
 {noformat}
 SliveTest: Unable to run job due to error:
 java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-2-secure
 at 
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
 at 
 org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:258)
 at 
 org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:299)
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930236#comment-13930236
 ] 

Hudson commented on HDFS-6085:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
HDFS-6085. Improve CacheReplicationMonitor log messages a bit (cmccabe) 
(cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576194)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java


 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6055) Change default configuration to limit file name length in HDFS

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930233#comment-13930233
 ] 

Hudson commented on HDFS-6055:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #506 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/506/])
HDFS-6055. Change default configuration to limit file name length in HDFS. 
Contributed by Chris Nauroth. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576095)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestSymlinkHdfs.java


 Change default configuration to limit file name length in HDFS
 --

 Key: HDFS-6055
 URL: https://issues.apache.org/jira/browse/HDFS-6055
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Suresh Srinivas
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-6055.1.patch, HDFS-6055.2.patch


 Currently the configuration dfs.namenode.fs-limits.max-component-length is set 
 to 0, so HDFS file names have no length limit. However, we see more users run 
 into issues where they copy files from HDFS to another file system and the 
 copy fails because the file name is too long.
 I propose changing the default value of 
 dfs.namenode.fs-limits.max-component-length to a reasonable value. This will 
 be an incompatible change. However, users who need long file names can 
 override this configuration to turn off the length limit.
 What do folks think?
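For illustration only, a minimal sketch of the semantics of this limit (0 disables the check); the standalone class and method names are made up and this is not the NameNode's actual enforcement code:

{code:java}
// Sketch: mimics the semantics of dfs.namenode.fs-limits.max-component-length,
// where 0 means "no limit". The class and method names are hypothetical.
public class ComponentLengthCheckSketch {
  static void checkPath(String path, int maxComponentLength) {
    if (maxComponentLength <= 0) {
      return;                               // 0 (the old default) disables the check
    }
    for (String component : path.split("/")) {
      if (component.length() > maxComponentLength) {
        throw new IllegalArgumentException("Path component '" + component
            + "' exceeds the limit of " + maxComponentLength + " characters");
      }
    }
  }

  public static void main(String[] args) {
    StringBuilder longName = new StringBuilder();
    for (int i = 0; i < 300; i++) {
      longName.append('x');
    }
    checkPath("/user/alice/data.txt", 255);     // fine
    checkPath("/user/alice/" + longName, 0);    // fine: limit disabled
    checkPath("/user/alice/" + longName, 255);  // throws IllegalArgumentException
  }
}
{code}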



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930348#comment-13930348
 ] 

Hudson commented on HDFS-5535:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
Move HDFS-5535 to Release 2.4.0 in CHANGES.txt. (szetszwo: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576148)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Umbrella jira for improved HDFS rolling upgrades
 

 Key: HDFS-5535
 URL: https://issues.apache.org/jira/browse/HDFS-5535
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, ha, hdfs-client, namenode
Affects Versions: 3.0.0, 2.2.0
Reporter: Nathan Roberts
Assignee: Tsz Wo Nicholas Sze
 Fix For: 2.4.0

 Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v2.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v3.pdf, h5535_20140219.patch, 
 h5535_20140220-1554.patch, h5535_20140220b.patch, h5535_20140221-2031.patch, 
 h5535_20140224-1931.patch, h5535_20140225-1225.patch, 
 h5535_20140226-1328.patch, h5535_20140226-1911.patch, 
 h5535_20140227-1239.patch, h5535_20140228-1714.patch, 
 h5535_20140304-1138.patch, h5535_20140304-branch-2.patch, 
 h5535_20140310-branch-2.patch, hdfs-5535-test-plan.pdf


 In order to roll a new HDFS release through a large cluster quickly and 
 safely, a few enhancements are needed in HDFS. An initial High level design 
 document will be attached to this jira, and sub-jiras will itemize the 
 individual tasks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6070) Cleanup use of ReadStatistics in DFSInputStream

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930355#comment-13930355
 ] 

Hudson commented on HDFS-6070:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
HDFS-6070. Cleanup use of ReadStatistics in DFSInputStream. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576047)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java


 Cleanup use of ReadStatistics in DFSInputStream
 ---

 Key: HDFS-6070
 URL: https://issues.apache.org/jira/browse/HDFS-6070
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Fix For: 2.4.0

 Attachments: hdfs-6070.patch


 Trivial little code cleanup related to DFSInputStream#ReadStatistics to use 
 update methods rather than reaching in directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930360#comment-13930360
 ] 

Hudson commented on HDFS-6085:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
HDFS-6085. Improve CacheReplicationMonitor log messages a bit (cmccabe) 
(cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576194)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java


 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6077) running slive with webhdfs on secure HA cluster fails with unkown host exception

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930359#comment-13930359
 ] 

Hudson commented on HDFS-6077:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
HDFS-6077. running slive with webhdfs on secure HA cluster fails with unkown 
host exception. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576076)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


 running slive with webhdfs on secure HA cluster fails with unkown host 
 exception
 

 Key: HDFS-6077
 URL: https://issues.apache.org/jira/browse/HDFS-6077
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Fix For: 2.4.0

 Attachments: HDFS-6077.000.patch


 SliveTest fails with following.  See the comment for more logs.
 {noformat}
 SliveTest: Unable to run job due to error:
 java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-2-secure
 at 
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
 at 
 org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:258)
 at 
 org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:299)
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6055) Change default configuration to limit file name length in HDFS

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930357#comment-13930357
 ] 

Hudson commented on HDFS-6055:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
HDFS-6055. Change default configuration to limit file name length in HDFS. 
Contributed by Chris Nauroth. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576095)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestSymlinkHdfs.java


 Change default configuration to limit file name length in HDFS
 --

 Key: HDFS-6055
 URL: https://issues.apache.org/jira/browse/HDFS-6055
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Suresh Srinivas
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-6055.1.patch, HDFS-6055.2.patch


 Currently the configuration dfs.namenode.fs-limits.max-component-length is set 
 to 0, so HDFS file names have no length limit. However, we see more users run 
 into issues where they copy files from HDFS to another file system and the 
 copy fails because the file name is too long.
 I propose changing the default value of 
 dfs.namenode.fs-limits.max-component-length to a reasonable value. This will 
 be an incompatible change. However, users who need long file names can 
 override this configuration to turn off the length limit.
 What do folks think?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5892) TestDeleteBlockPool fails in branch-2

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930361#comment-13930361
 ] 

Hudson commented on HDFS-5892:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1698 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1698/])
HDFS-5892. TestDeleteBlockPool fails in branch-2. Contributed by Ted Yu. 
(wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576035)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSNNTopology.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java


 TestDeleteBlockPool fails in branch-2
 -

 Key: HDFS-5892
 URL: https://issues.apache.org/jira/browse/HDFS-5892
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5892.patch, 
 org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool-output.txt


 Running test suite on Linux, I got:
 {code}
 testDeleteBlockPool(org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool)
   Time elapsed: 8.143 sec   ERROR!
 java.io.IOException: All datanodes 127.0.0.1:43721 are bad. Aborting...
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-11 Thread Guo Ruijing (JIRA)
Guo Ruijing created HDFS-6087:
-

 Summary: Unify HDFS write/append/truncate
 Key: HDFS-6087
 URL: https://issues.apache.org/jira/browse/HDFS-6087
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Guo Ruijing
 Attachments: HDFS Design Proposal.pdf

In the existing implementation, an HDFS file can be appended to and an HDFS 
block can be reopened for append. This design introduces complexity, including 
lease recovery. If we design HDFS blocks as immutable, append & truncate become 
very simple. The idea is that an HDFS block is immutable once it is committed 
to the namenode. If the block is not yet committed to the namenode, it is the 
HDFS client's responsibility to re-add it with a new block ID.
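To make the immutable-block idea concrete, a sketch of the client-side flow it implies: a block committed to the namenode is never reopened, and an uncommitted tail block is resubmitted under a new block ID. Every type and method here is a hypothetical stand-in, not an actual HDFS API:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: append under an "immutable once committed" model. NamenodeStub
// and Block are stand-ins; no lease recovery is needed because a committed
// block is never reopened.
public class ImmutableBlockAppendSketch {
  static class Block {
    final long id; final byte[] data; final boolean committed;
    Block(long id, byte[] data, boolean committed) {
      this.id = id; this.data = data; this.committed = committed;
    }
  }

  static class NamenodeStub {
    private final AtomicLong nextId = new AtomicLong(1);
    long allocateBlockId() { return nextId.getAndIncrement(); }
  }

  // Append: if the last block is committed, start a fresh block; if it is not
  // committed, the client replaces it with a new block ID carrying old + new data.
  static void append(List<Block> file, NamenodeStub nn, byte[] extra) {
    byte[] payload = extra;
    if (!file.isEmpty() && !file.get(file.size() - 1).committed) {
      Block tail = file.remove(file.size() - 1);
      payload = new byte[tail.data.length + extra.length];
      System.arraycopy(tail.data, 0, payload, 0, tail.data.length);
      System.arraycopy(extra, 0, payload, tail.data.length, extra.length);
    }
    file.add(new Block(nn.allocateBlockId(), payload, false));  // commit happens later
  }

  public static void main(String[] args) {
    NamenodeStub nn = new NamenodeStub();
    List<Block> file = new ArrayList<>();
    append(file, nn, "hello ".getBytes());
    append(file, nn, "world".getBytes());
    System.out.println("blocks=" + file.size() + ", lastId=" + file.get(0).id);
  }
}
{code}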



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6087) Unify HDFS write/append/truncate

2014-03-11 Thread Guo Ruijing (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guo Ruijing updated HDFS-6087:
--

Attachment: HDFS Design Proposal.pdf

 Unify HDFS write/append/truncate
 

 Key: HDFS-6087
 URL: https://issues.apache.org/jira/browse/HDFS-6087
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Guo Ruijing
 Attachments: HDFS Design Proposal.pdf


 In the existing implementation, an HDFS file can be appended to and an HDFS 
 block can be reopened for append. This design introduces complexity, including 
 lease recovery. If we design HDFS blocks as immutable, append & truncate become 
 very simple. The idea is that an HDFS block is immutable once it is committed 
 to the namenode. If the block is not yet committed to the namenode, it is the 
 HDFS client's responsibility to re-add it with a new block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930419#comment-13930419
 ] 

Hudson commented on HDFS-5535:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
Move HDFS-5535 to Release 2.4.0 in CHANGES.txt. (szetszwo: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576148)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Umbrella jira for improved HDFS rolling upgrades
 

 Key: HDFS-5535
 URL: https://issues.apache.org/jira/browse/HDFS-5535
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, ha, hdfs-client, namenode
Affects Versions: 3.0.0, 2.2.0
Reporter: Nathan Roberts
Assignee: Tsz Wo Nicholas Sze
 Fix For: 2.4.0

 Attachments: HDFSRollingUpgradesHighLevelDesign.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v2.pdf, 
 HDFSRollingUpgradesHighLevelDesign.v3.pdf, h5535_20140219.patch, 
 h5535_20140220-1554.patch, h5535_20140220b.patch, h5535_20140221-2031.patch, 
 h5535_20140224-1931.patch, h5535_20140225-1225.patch, 
 h5535_20140226-1328.patch, h5535_20140226-1911.patch, 
 h5535_20140227-1239.patch, h5535_20140228-1714.patch, 
 h5535_20140304-1138.patch, h5535_20140304-branch-2.patch, 
 h5535_20140310-branch-2.patch, hdfs-5535-test-plan.pdf


 In order to roll a new HDFS release through a large cluster quickly and 
 safely, a few enhancements are needed in HDFS. An initial High level design 
 document will be attached to this jira, and sub-jiras will itemize the 
 individual tasks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930421#comment-13930421
 ] 

Hudson commented on HDFS-3405:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
Move HDFS-3405 to 2.4.0 section in CHANGES.txt (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576158)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged 
 fsimages
 

 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
Reporter: Aaron T. Myers
Assignee: Vinayakumar B
 Fix For: 2.4.0

 Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, 
 HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch


 As Todd points out in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
 the current scheme for a checkpointing daemon to upload a merged fsimage file 
 to an NN is to issue an HTTP GET request that tells the target NN to issue 
 another GET request back to the checkpointing daemon to retrieve the merged 
 fsimage file. There's no fundamental reason the checkpointing daemon can't 
 just use an HTTP POST or PUT to send the merged fsimage file directly, rather 
 than the double-GET scheme.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6055) Change default configuration to limit file name length in HDFS

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930427#comment-13930427
 ] 

Hudson commented on HDFS-6055:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
HDFS-6055. Change default configuration to limit file name length in HDFS. 
Contributed by Chris Nauroth. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576095)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestSymlinkHdfs.java


 Change default configuration to limit file name length in HDFS
 --

 Key: HDFS-6055
 URL: https://issues.apache.org/jira/browse/HDFS-6055
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0, 2.4.0
Reporter: Suresh Srinivas
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-6055.1.patch, HDFS-6055.2.patch


 Currently the configuration dfs.namenode.fs-limits.max-component-length is set 
 to 0, so HDFS file names have no length limit. However, we see more users run 
 into issues where they copy files from HDFS to another file system and the 
 copy fails because the file name is too long.
 I propose changing the default value of 
 dfs.namenode.fs-limits.max-component-length to a reasonable value. This will 
 be an incompatible change. However, users who need long file names can 
 override this configuration to turn off the length limit.
 What do folks think?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6077) running slive with webhdfs on secure HA cluster fails with unkown host exception

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930429#comment-13930429
 ] 

Hudson commented on HDFS-6077:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
HDFS-6077. running slive with webhdfs on secure HA cluster fails with unkown 
host exception. Contributed by Jing Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576076)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


 running slive with webhdfs on secure HA cluster fails with unkown host 
 exception
 

 Key: HDFS-6077
 URL: https://issues.apache.org/jira/browse/HDFS-6077
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Fix For: 2.4.0

 Attachments: HDFS-6077.000.patch


 SliveTest fails with following.  See the comment for more logs.
 {noformat}
 SliveTest: Unable to run job due to error:
 java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-2-secure
 at 
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
 at 
 org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:258)
 at 
 org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:299)
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5892) TestDeleteBlockPool fails in branch-2

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930431#comment-13930431
 ] 

Hudson commented on HDFS-5892:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
HDFS-5892. TestDeleteBlockPool fails in branch-2. Contributed by Ted Yu. 
(wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576035)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSNNTopology.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java


 TestDeleteBlockPool fails in branch-2
 -

 Key: HDFS-5892
 URL: https://issues.apache.org/jira/browse/HDFS-5892
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor
 Fix For: 2.4.0

 Attachments: HDFS-5892.patch, 
 org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool-output.txt


 Running test suite on Linux, I got:
 {code}
 testDeleteBlockPool(org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool)
   Time elapsed: 8.143 sec   ERROR!
 java.io.IOException: All datanodes 127.0.0.1:43721 are bad. Aborting...
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6085) Improve CacheReplicationMonitor log messages a bit

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930430#comment-13930430
 ] 

Hudson commented on HDFS-6085:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
HDFS-6085. Improve CacheReplicationMonitor log messages a bit (cmccabe) 
(cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576194)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java


 Improve CacheReplicationMonitor log messages a bit
 --

 Key: HDFS-6085
 URL: https://issues.apache.org/jira/browse/HDFS-6085
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6085.001.patch


 It would be nice if the CacheReplicationMonitor logs printed out information 
 about blocks when logging at TRACE level.  We could also be a bit more 
 organized about the logs and include the directive ID in each message, so that 
 it is clear what each log message refers to.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6070) Cleanup use of ReadStatistics in DFSInputStream

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930425#comment-13930425
 ] 

Hudson commented on HDFS-6070:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
HDFS-6070. Cleanup use of ReadStatistics in DFSInputStream. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576047)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java


 Cleanup use of ReadStatistics in DFSInputStream
 ---

 Key: HDFS-6070
 URL: https://issues.apache.org/jira/browse/HDFS-6070
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Fix For: 2.4.0

 Attachments: hdfs-6070.patch


 Trivial little code cleanup related to DFSInputStream#ReadStatistics to use 
 update methods rather than reaching in directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930535#comment-13930535
 ] 

Hudson commented on HDFS-5638:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5305 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5305/])
HDFS-5638. HDFS implementation of FileContext API for ACLs. Contributed by 
Vinayakumar B. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576405)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/fs/Hdfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileContextAcl.java


 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5638) HDFS implementation of FileContext API for ACLs.

2014-03-11 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5638:


   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

I committed this to trunk, branch-2 and branch-2.4.  Thanks again for the 
contributions, Vinay!

 HDFS implementation of FileContext API for ACLs.
 

 Key: HDFS-5638
 URL: https://issues.apache.org/jira/browse/HDFS-5638
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: HDFS ACLs (HDFS-4685)
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.4.0

 Attachments: HDFS-5638.2.patch, HDFS-5638.patch, HDFS-5638.patch, 
 HDFS-5638.patch


 Add new methods to {{AbstractFileSystem}} and {{FileContext}} for 
 manipulating ACLs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6088) Add configurable maximum block count for datanode

2014-03-11 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-6088:


 Summary: Add configurable maximum block count for datanode
 Key: HDFS-6088
 URL: https://issues.apache.org/jira/browse/HDFS-6088
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee


Currently datanode resources are protected by the free-space check and the 
balancer.  But datanodes can run out of memory simply by storing too many 
blocks. If the blocks are small, datanodes will appear to have plenty of space 
to put more blocks.

I propose adding a configurable maximum block count to the datanode. Since 
datanodes can have different heap configurations, it makes sense to enforce 
this at the datanode level, rather than in the namenode.
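As a sketch of what the proposal could look like on the datanode side (all names are hypothetical; this is not an actual HDFS configuration key or class), a per-datanode cap checked before accepting a new replica:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a datanode-local cap on the number of stored blocks.
// The limit would come from a hypothetical datanode-level setting.
public class MaxBlockCountSketch {
  private final long maxBlocks;          // e.g. read from a datanode-level config
  private final AtomicLong blockCount = new AtomicLong();

  MaxBlockCountSketch(long maxBlocks) { this.maxBlocks = maxBlocks; }

  // Returns false when the datanode should refuse the new replica, even though
  // the free-space check would still pass for small blocks.
  boolean tryAcceptBlock() {
    long current = blockCount.get();
    while (current < maxBlocks) {
      if (blockCount.compareAndSet(current, current + 1)) {
        return true;
      }
      current = blockCount.get();
    }
    return false;
  }

  public static void main(String[] args) {
    MaxBlockCountSketch dn = new MaxBlockCountSketch(2);
    System.out.println(dn.tryAcceptBlock());  // true
    System.out.println(dn.tryAcceptBlock());  // true
    System.out.println(dn.tryAcceptBlock());  // false: over the configured cap
  }
}
{code}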



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6088) Add configurable maximum block count for datanode

2014-03-11 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930555#comment-13930555
 ] 

Kihwal Lee commented on HDFS-6088:
--

It would be nice, though, if the NN knew what was going on, so that the block 
placement policy can avoid picking full nodes.  The DN could include its free 
block count in its heartbeat.

 Add configurable maximum block count for datanode
 -

 Key: HDFS-6088
 URL: https://issues.apache.org/jira/browse/HDFS-6088
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee

 Currently datanode resources are protected by the free-space check and the 
 balancer.  But datanodes can run out of memory simply by storing too many 
 blocks. If the blocks are small, datanodes will appear to have plenty of 
 space to put more blocks.
 I propose adding a configurable maximum block count to the datanode. Since 
 datanodes can have different heap configurations, it makes sense to enforce 
 this at the datanode level, rather than in the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint

2014-03-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930614#comment-13930614
 ] 

Yongjun Zhang commented on HDFS-5944:
-

Thanks [~zhaoyunjiong] for reporting the issue and the fix. 
Hi [~brandonli], thanks for reviewing and committing the fix.  It's said to be 
fixed in 2.4.0 but I don't see it in branch-2.4. Would you please check? Thanks.



 LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and 
 cause SecondaryNameNode failed do checkpoint
 

 Key: HDFS-5944
 URL: https://issues.apache.org/jira/browse/HDFS-5944
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.3.0, 2.4.0

 Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, 
 HDFS-5944.test.txt, HDFS-5944.trunk.patch


 In our cluster, we encountered an error like this:
 java.io.IOException: saveLeases found path 
 /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction.
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949)
 What happened:
 Client A opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write.
 And Client A continued to refresh its lease.
 Client B deleted /XXX/20140206/04_30/
 Client C opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write
 Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log
 Then the SecondaryNameNode tried to do a checkpoint and failed, because it 
 could not delete the lease held by Client A after Client B deleted 
 /XXX/20140206/04_30/.
 The reason is a bug in findLeaseWithPrefixPath:
  int srclen = prefix.length();
  if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) {
 entries.put(entry.getKey(), entry.getValue());
   }
 Here when prefix is /XXX/20140206/04_30/, and p is 
 /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_'.
 The fix is simple, I'll upload patch later.
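
One possible shape of the fix, as a sketch only (not necessarily the committed 
patch; prefix, p, entries and entry come from the surrounding loop in 
findLeaseWithPrefixPath):

{code}
// Strip a trailing separator so "/a/b/" matches children the same way "/a/b" does.
String src = prefix;
if (src.length() > 1 && src.endsWith(Path.SEPARATOR)) {
  src = src.substring(0, src.length() - 1);   // "/XXX/20140206/04_30/" -> ".../04_30"
}
final int srclen = src.length();
if (p.startsWith(src)
    && (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR)) {
  entries.put(entry.getKey(), entry.getValue());
}
{code}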



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-11 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-6007:
---

Attachment: HDFS-6007-3.patch

Added a description of the shared memory segments.

 Update documentation about short-circuit local reads
 

 Key: HDFS-6007
 URL: https://issues.apache.org/jira/browse/HDFS-6007
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
 Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
 HDFS-6007-3.patch


 Updating the contents of HDFS Short-Circuit Local Reads based on the 
 changes in HDFS-4538 and HDFS-4953.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6035) TestCacheDirectives#testCacheManagerRestart is failing on branch-2

2014-03-11 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930720#comment-13930720
 ] 

Mit Desai commented on HDFS-6035:
-

[~sathish.gurram], can you let me know which branch you are testing this on?

 TestCacheDirectives#testCacheManagerRestart is failing on branch-2
 --

 Key: HDFS-6035
 URL: https://issues.apache.org/jira/browse/HDFS-6035
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.4.0
Reporter: Mit Desai
Assignee: sathish
 Attachments: HDFS-6035-0001.patch


 {noformat}
 java.io.IOException: Inconsistent checkpoint fields.
 LV = -51 namespaceID = 1641397469 cTime = 0 ; clusterId = testClusterID ; 
 blockpoolId = BP-423574854-x.x.x.x-1393478669835.
 Expecting respectively: -51; 2; 0; testClusterID; 
 BP-2051361571-x.x.x.x-1393478572877.
   at 
 org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:133)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:526)
   at 
 org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testCacheManagerRestart(TestCacheDirectives.java:582)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5274) Add Tracing to HDFS

2014-03-11 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-5274:
---

Attachment: 3node_put_200mb.png

 Add Tracing to HDFS
 ---

 Key: HDFS-5274
 URL: https://issues.apache.org/jira/browse/HDFS-5274
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Affects Versions: 2.1.1-beta
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: 3node_get_200mb.png, 3node_put_200mb.png, 
 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, 
 HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, 
 HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, 
 HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, HDFS-5274-8.patch, 
 HDFS-5274-9.patch, Zipkin   Trace a06e941b0172ec73.png, Zipkin   Trace 
 d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png


 Since Google's Dapper paper has shown the benefits of tracing for a large 
 distributed system, it seems like a good time to add tracing to HDFS.  HBase 
 has added tracing using HTrace.  I propose that the same can be done within 
 HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5274) Add Tracing to HDFS

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930756#comment-13930756
 ] 

Hadoop QA commented on HDFS-5274:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633965/3node_put_200mb.png
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6373//console

This message is automatically generated.

 Add Tracing to HDFS
 ---

 Key: HDFS-5274
 URL: https://issues.apache.org/jira/browse/HDFS-5274
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Affects Versions: 2.1.1-beta
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: 3node_get_200mb.png, 3node_put_200mb.png, 
 3node_put_200mb.png, HDFS-5274-0.patch, HDFS-5274-1.patch, 
 HDFS-5274-10.patch, HDFS-5274-11.txt, HDFS-5274-12.patch, HDFS-5274-13.patch, 
 HDFS-5274-2.patch, HDFS-5274-3.patch, HDFS-5274-4.patch, HDFS-5274-5.patch, 
 HDFS-5274-6.patch, HDFS-5274-7.patch, HDFS-5274-8.patch, HDFS-5274-8.patch, 
 HDFS-5274-9.patch, Zipkin   Trace a06e941b0172ec73.png, Zipkin   Trace 
 d0f0d66b8a258a69.png, ss-5274v8-get.png, ss-5274v8-put.png


 Since Google's Dapper paper has shown the benefits of tracing for a large 
 distributed system, it seems like a good time to add tracing to HDFS.  HBase 
 has added tracing using HTrace.  I propose that the same can be done within 
 HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-4461) DirectoryScanner: volume path prefix takes up memory for every block that is scanned

2014-03-11 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-4461:
-

Attachment: HDFS-4461.branch-0.23.patch

I thought we could wait till 2.x, but some 0.23 users are creating a lot of 
small files (i.e. small blocks) and DNs are running out of memory when the 
DirectoryScanner runs. The peak heap usage can be almost 2x or even 3x the base 
usage if garbage from one dir scan survives until the next scan.

The patch is a straight back-port of the trunk version. The difference comes 
from the fact that a source file got split into multiple files in 
branch-2/trunk. Other than that, the core change is exactly the same.

 DirectoryScanner: volume path prefix takes up memory for every block that is 
 scanned 
 -

 Key: HDFS-4461
 URL: https://issues.apache.org/jira/browse/HDFS-4461
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.1.0-beta

 Attachments: HDFS-4461.002.patch, HDFS-4461.003.patch, 
 HDFS-4461.004.patch, HDFS-4461.branch-0.23.patch, HDFS-4661.006.patch, 
 memory-analysis.png


 In the {{DirectoryScanner}}, we create a class {{ScanInfo}} for every block.  
 This object contains two File objects-- one for the metadata file, and one 
 for the block file.  Since those File objects contain full paths, users who 
 pick a lengthy path for their volume roots will end up using an extra 
 N_blocks * path_prefix bytes per block scanned.  We also don't really need to 
 store File objects-- storing strings and then creating File objects as needed 
 would be cheaper.  This would be a nice efficiency improvement.
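
A minimal sketch of the suggestion (field and method names are illustrative, 
not the actual patch): store only the path suffix relative to the volume root 
and materialize File objects on demand.

{code}
import java.io.File;

class ScanInfoSketch {
  private final File volumeRoot;     // shared per volume, not per block
  private final String blockSuffix;  // e.g. "current/finalized/blk_123"

  ScanInfoSketch(File volumeRoot, String blockSuffix) {
    this.volumeRoot = volumeRoot;
    this.blockSuffix = blockSuffix;
  }

  File getBlockFile() {
    // The full path is built only when it is actually needed.
    return new File(volumeRoot, blockSuffix);
  }
}
{code}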



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-4461) DirectoryScanner: volume path prefix takes up memory for every block that is scanned

2014-03-11 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-4461:
-

Fix Version/s: 0.23.11

 DirectoryScanner: volume path prefix takes up memory for every block that is 
 scanned 
 -

 Key: HDFS-4461
 URL: https://issues.apache.org/jira/browse/HDFS-4461
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.1.0-beta, 0.23.11

 Attachments: HDFS-4461.002.patch, HDFS-4461.003.patch, 
 HDFS-4461.004.patch, HDFS-4461.branch-0.23.patch, HDFS-4661.006.patch, 
 memory-analysis.png


 In the {{DirectoryScanner}}, we create a class {{ScanInfo}} for every block.  
 This object contains two File objects-- one for the metadata file, and one 
 for the block file.  Since those File objects contain full paths, users who 
 pick a lengthy path for their volume roots will end up using an extra 
 N_blocks * path_prefix bytes per block scanned.  We also don't really need to 
 store File objects-- storing strings and then creating File objects as needed 
 would be cheaper.  This would be a nice efficiency improvement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4461) DirectoryScanner: volume path prefix takes up memory for every block that is scanned

2014-03-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930815#comment-13930815
 ] 

Colin Patrick McCabe commented on HDFS-4461:


+1 for the backport.  Note that I have not tested it, just reviewed it.

 DirectoryScanner: volume path prefix takes up memory for every block that is 
 scanned 
 -

 Key: HDFS-4461
 URL: https://issues.apache.org/jira/browse/HDFS-4461
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.1.0-beta, 0.23.11

 Attachments: HDFS-4461.002.patch, HDFS-4461.003.patch, 
 HDFS-4461.004.patch, HDFS-4461.branch-0.23.patch, HDFS-4661.006.patch, 
 memory-analysis.png


 In the {{DirectoryScanner}}, we create a class {{ScanInfo}} for every block.  
 This object contains two File objects-- one for the metadata file, and one 
 for the block file.  Since those File objects contain full paths, users who 
 pick a lengthy path for their volume roots will end up using an extra 
 N_blocks * path_prefix bytes per block scanned.  We also don't really need to 
 store File objects-- storing strings and then creating File objects as needed 
 would be cheaper.  This would be a nice efficiency improvement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930877#comment-13930877
 ] 

Hadoop QA commented on HDFS-6007:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633949/HDFS-6007-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6372//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6372//console

This message is automatically generated.

 Update documentation about short-circuit local reads
 

 Key: HDFS-6007
 URL: https://issues.apache.org/jira/browse/HDFS-6007
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
 Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, 
 HDFS-6007-3.patch


 Updating the contents of HDFS Short-Circuit Local Reads based on the 
 changes in HDFS-4538 and HDFS-4953.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint

2014-03-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930951#comment-13930951
 ] 

Brandon Li commented on HDFS-5944:
--

I forgot to append the jira number when doing the back porting.
Usually I don't forget. But, sorry for this one. Here are the log entries in 
branch-2 and 2.4.

For branch-2:
r1570372 | brandonli | 2014-02-20 14:32:49 -0800 (Thu, 20 Feb 2014) | 1 line

Merging change r1570366 from trunk

For branch-2.4:
r1570377 | brandonli | 2014-02-20 14:40:00 -0800 (Thu, 20 Feb 2014) | 1 line

Merging change r1570372 from branch-2

 LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and 
 cause SecondaryNameNode failed do checkpoint
 

 Key: HDFS-5944
 URL: https://issues.apache.org/jira/browse/HDFS-5944
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.3.0, 2.4.0

 Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, 
 HDFS-5944.test.txt, HDFS-5944.trunk.patch


 In our cluster, we encountered an error like this:
 java.io.IOException: saveLeases found path 
 /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction.
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949)
 What happened:
 Client A opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write.
 And Client A continued to refresh its lease.
 Client B deleted /XXX/20140206/04_30/
 Client C opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write
 Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log
 Then the SecondaryNameNode tried to do a checkpoint and failed, because it 
 could not delete the lease held by Client A after Client B deleted 
 /XXX/20140206/04_30/.
 The reason is a bug in findLeaseWithPrefixPath:
  int srclen = prefix.length();
  if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) {
 entries.put(entry.getKey(), entry.getValue());
   }
 Here when prefix is /XXX/20140206/04_30/, and p is 
 /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_'.
 The fix is simple, I'll upload patch later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Arpit Gupta (JIRA)
Arpit Gupta created HDFS-6089:
-

 Summary: Standby NN while transitioning to active throws a 
connection refused error when the prior active NN process is suspended
 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao


The following scenario was tested:

* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
active.


What was noticed was that sometimes the call to get the service state of nn2 
got a socket time out connection.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930954#comment-13930954
 ] 

Arpit Gupta commented on HDFS-6089:
---

Here is the console log

{code}
sudo su - -c /usr/bin/hdfs haadmin -getServiceState nn1 hdfs
active
exit code = 0
sudo su - -c /usr/bin/hdfs haadmin -getServiceState nn2 hdfs
standby
exit code = 0
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null hostname sudo 
su - -c "cat /grid/0/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid | xargs kill 
-19" hdfs
sudo su - -c /usr/bin/hdfs haadmin -getServiceState nn1 hdfs
Operation failed: Call From host1/ip to host1:8020 failed on socket timeout 
exception: java.net.SocketTimeoutException: 2 millis timeout while waiting 
for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=host1/ip:35192 
remote=host1/ip:8020]; For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout
exit code = 255
sudo su - -c /usr/bin/hdfs haadmin -getServiceState nn2 hdfs
Operation failed: Call From host2/ip to host2:8020 failed on socket timeout 
exception: java.net.SocketTimeoutException: 2 millis timeout while waiting 
for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=host2/ip:37640 
remote=host2/68.142.247.217:8020]; For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout
exit code = 255
{code}

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao

 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket time out connection.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Arpit Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Gupta updated HDFS-6089:
--

Description: 
The following scenario was tested:

* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
active.


What was noticed that some times the call to get the service state of nn2 got a 
socket time out exception.

  was:
The following scenario was tested:

* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
active.


What was noticed that some times the call to get the service state of nn2 got a 
socket time out connection.


 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao

 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6090) Use MiniDFSCluster.Builder instead of deprecated constructors

2014-03-11 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HDFS-6090:
---

 Summary: Use MiniDFSCluster.Builder instead of deprecated 
constructors
 Key: HDFS-6090
 URL: https://issues.apache.org/jira/browse/HDFS-6090
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Reporter: Akira AJISAKA
Priority: Minor


Some test classes are using deprecated constructors such as 
{{MiniDFSCluster(Configuration, int, boolean, String[], String[])}} for 
building a MiniDFSCluster.
These classes should use {{MiniDFSCluster.Builder}} to reduce javac warnings 
and improve code readability.
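
For illustration, the kind of change this would involve (a sketch only; the 
exact builder options depend on what each test needs):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class BuilderExample {
  public void startCluster() throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Deprecated style:
    //   MiniDFSCluster cluster = new MiniDFSCluster(conf, 3, true, null, null);
    // Builder style:
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(3)
        .format(true)
        .build();
    try {
      cluster.waitActive();
      // ... test body ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}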



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4461) DirectoryScanner: volume path prefix takes up memory for every block that is scanned

2014-03-11 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930966#comment-13930966
 ] 

Kihwal Lee commented on HDFS-4461:
--

Thanks, Colin. I've checked it into branch-0.23.

 DirectoryScanner: volume path prefix takes up memory for every block that is 
 scanned 
 -

 Key: HDFS-4461
 URL: https://issues.apache.org/jira/browse/HDFS-4461
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.1.0-beta, 0.23.11

 Attachments: HDFS-4461.002.patch, HDFS-4461.003.patch, 
 HDFS-4461.004.patch, HDFS-4461.branch-0.23.patch, HDFS-4661.006.patch, 
 memory-analysis.png


 In the {{DirectoryScanner}}, we create a class {{ScanInfo}} for every block.  
 This object contains two File objects-- one for the metadata file, and one 
 for the block file.  Since those File objects contain full paths, users who 
 pick a lengthy path for their volume roots will end up using an extra 
 N_blocks * path_prefix bytes per block scanned.  We also don't really need to 
 store File objects-- storing strings and then creating File objects as needed 
 would be cheaper.  This would be a nice efficiency improvement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint

2014-03-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930976#comment-13930976
 ] 

Yongjun Zhang commented on HDFS-5944:
-

Hi Brandon, many thanks for the clarification.


 LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and 
 cause SecondaryNameNode failed do checkpoint
 

 Key: HDFS-5944
 URL: https://issues.apache.org/jira/browse/HDFS-5944
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 1.2.0, 2.2.0
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.3.0, 2.4.0

 Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, 
 HDFS-5944.test.txt, HDFS-5944.trunk.patch


 In our cluster, we encountered an error like this:
 java.io.IOException: saveLeases found path 
 /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction.
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949)
 What happened:
 Client A opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write.
 And Client A continued to refresh its lease.
 Client B deleted /XXX/20140206/04_30/
 Client C opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write
 Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log
 Then the SecondaryNameNode tried to do a checkpoint and failed, because it 
 could not delete the lease held by Client A after Client B deleted 
 /XXX/20140206/04_30/.
 The reason is a bug in findLeaseWithPrefixPath:
  int srclen = prefix.length();
  if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) {
 entries.put(entry.getKey(), entry.getValue());
   }
 Here when prefix is /XXX/20140206/04_30/, and p is 
 /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_'.
 The fix is simple, I'll upload patch later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930973#comment-13930973
 ] 

Jing Zhao commented on HDFS-6072:
-

+1 for the new patch. Thanks for the cleaning, Haohui!

 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6091) dfs.journalnode.edits.dir should accept URI

2014-03-11 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HDFS-6091:
--

 Summary: dfs.journalnode.edits.dir should accept URI
 Key: HDFS-6091
 URL: https://issues.apache.org/jira/browse/HDFS-6091
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: journal-node
Affects Versions: 2.2.0
Reporter: Allen Wittenauer
Priority: Minor


Using a URI in dfs.journalnode.edits.dir (such as file:///foo) throws a 
"Journal dir 'file:/foo' should be an absolute path" error.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6072) Clean up dead code of FSImage

2014-03-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6072:
-

   Resolution: Fixed
Fix Version/s: 2.4.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~ajisakaa] 
and [~jingzhao] for the review.

 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.4.0

 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage

2014-03-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931026#comment-13931026
 ] 

Haohui Mai commented on HDFS-6084:
--

Some users have expressed concerns about potential security issues with the 
external links. They are concerned that when a user clicks on an external link, 
the referrer header in the HTTP request might leak sensitive information (e.g., 
the path of a directory).

I guess that we can leave the text here but remove all external links. 
[~tthompso], do you have any suggestions?

 Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
 

 Key: HDFS-6084
 URL: https://issues.apache.org/jira/browse/HDFS-6084
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.3.0
Reporter: Travis Thompson
Assignee: Travis Thompson
Priority: Minor
 Attachments: HDFS-6084.1.patch.txt


 When clicking the Hadoop title the user is taken to the Hadoop homepage, 
 which feels unintuitive.  There's already a link at the bottom where it's 
 always been, which is reasonable.  I think that the title should go to the 
 main Namenode page, #tab-overview.  Suggestions?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6072) Clean up dead code of FSImage

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931031#comment-13931031
 ] 

Hudson commented on HDFS-6072:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5306 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5306/])
HDFS-6072. Clean up dead code of FSImage. Contributed by Haohui Mai. (wheat9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576513)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiff.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java


 Clean up dead code of FSImage
 -

 Key: HDFS-6072
 URL: https://issues.apache.org/jira/browse/HDFS-6072
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.4.0

 Attachments: HDFS-6072.000.patch, HDFS-6072.001.patch, 
 HDFS-6072.002.patch


 After HDFS-5698, HDFS stores the FSImage in protobuf format. The old code for 
 saving the FSImage is now dead and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931033#comment-13931033
 ] 

Brandon Li commented on HDFS-6080:
--

[~ashahab], given your test results, and that HDFS file sizes are usually in 
the MB range, I think we may want to keep 1MB as the default. The user can 
always change it to a smaller size when needed. What do you think?

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1MB, 100MB, and 1GB files. We noticed a 
 significant performance decline as the size increased when compared to FUSE. 
 We realized that the issue was the hardcoded rtmax size (64k). 
 When we increased rtmax to 1MB, we got a 10x improvement in performance.
 NFS reads:
 | File          | Size (bytes) | Run 1         | Run 2         | Run 3         | Average        | Std. Dev.            |
 | testFile100Mb | 104857600    | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824   | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561     |
 | testFile1Mb   | 1048576      | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667 | 0.030412987573361663 |
 Fuse reads:
 | File          | Size (bytes) | Run 1        | Run 2        | Run 3        | Average        | Std. Dev.             |
 | testFile100Mb | 104857600    | 2.394459443  | 2.695265191  | 2.50046517   | 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824   | 25.03324924  | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576     |
 | testFile1Mb   | 1048576      | 0.271615094  | 0.270835986  | 0.271796438  | 0.271415839333 | 0.0004166483951065848 |
 (NFS read after rtmax = 1MB)
 | File          | Size (bytes) | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
 | testFile100Mb | 104857600    | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
 | testFile1Gb   | 1073741824   | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |
 | testFile1Mb   | 1048576      | 0.115602858  | 0.106826253 | 0.125229976  | 0.1158863623334 | 0.007515962395481867 |
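
The general shape of the change being discussed, as a sketch only (the property 
names below are illustrative, not the keys introduced by the patch):

{code}
// Sketch: read the limits from the Configuration instead of hard-coding them
// in RpcProgramNfs3. 1MB matches the default being discussed above.
static int readRtmax(Configuration config) {
  return config.getInt("nfs3.server.rtmax", 1024 * 1024);  // illustrative key
}

static int readWtmax(Configuration config) {
  return config.getInt("nfs3.server.wtmax", 1024 * 1024);  // illustrative key
}
{code}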



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931036#comment-13931036
 ] 

Andrew Wang commented on HDFS-6086:
---

+1 looks good, thanks Colin

 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5196) Provide more snapshot information in WebUI

2014-03-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931042#comment-13931042
 ] 

Haohui Mai commented on HDFS-5196:
--

The patch mostly looks good.

{code}
diff --git 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
index 43952be..cb0bf79 100644
--- 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
+++ 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java
@@ -243,6 +243,8 @@ private static void setupServlets(HttpServer2 httpServer, 
Configuration conf) {
 FileChecksumServlets.RedirectServlet.class, false);
     httpServer.addInternalServlet("contentSummary", "/contentSummary/*",
         ContentSummaryServlet.class, false);
+    httpServer.addInternalServlet("snapshot",
+        SnapshotInfoServlet.PATH_SPEC, SnapshotInfoServlet.class, false);
   }
{code}

It might be more appropriate to put the information in jmx. For example, in 
{{FSNamesystemState}}.

{code}
+  public Object getSnapshottableDirs() throws IOException {
+    //Map<String, Map<String, Object>> info = new HashMap<String, Map<String, Object>>();
+    List<Map> info = new ArrayList<Map>();
+    SnapshottableDirectoryStatus[] stats = getSnapshottableDirListing();
+    if (stats == null) {
+      return "{}";
...
{code}

The code should return an {{MXBean}} instead of a JSON string. Otherwise it 
requires hacks and workarounds on the client side to parse the JSON. Please see 
HDFS-6013 if you have more questions.
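
For illustration, a jmx-friendly shape could look roughly like this (a sketch 
only; the method and attribute names are assumed, not taken from the patch):

{code}
// Return structured data that the MXBean machinery can serialize, instead of
// hand-building a JSON string.
public List<Map<String, Object>> getSnapshottableDirs() throws IOException {
  List<Map<String, Object>> info = new ArrayList<Map<String, Object>>();
  SnapshottableDirectoryStatus[] stats = getSnapshottableDirListing();
  if (stats == null) {
    return info;                       // empty list, not a "{}" string
  }
  for (SnapshottableDirectoryStatus s : stats) {
    Map<String, Object> entry = new HashMap<String, Object>();
    entry.put("path", s.getFullPath().toString());
    entry.put("snapshotNumber", s.getSnapshotNumber());
    entry.put("snapshotQuota", s.getSnapshotQuota());
    info.add(entry);
  }
  return info;
}
{code}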

{code}
+alert(err);

...

-  })).error(ajax_error_handler);
+  }),
+  function (url, jqxhr, text, err) {
+    show_err_msg('<p>Failed to retrieve data from ' + url + ', cause: ' + err + '</p>');
+  });
{code}

These changes look unrelated. Can you please remove them from this patch?

 Provide more snapshot information in WebUI
 --

 Key: HDFS-5196
 URL: https://issues.apache.org/jira/browse/HDFS-5196
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Haohui Mai
Assignee: Shinichi Yamashita
Priority: Minor
 Attachments: HDFS-5196-2.patch, HDFS-5196.patch, HDFS-5196.patch, 
 HDFS-5196.patch, snapshot-new-webui.png, snapshottable-directoryList.png, 
 snapshotteddir.png


 The WebUI should provide more detailed information about snapshots, such as 
 all snapshottable directories and corresponding number of snapshots 
 (suggested in HDFS-4096).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931045#comment-13931045
 ] 

Jing Zhao commented on HDFS-6089:
-

Checked the log with Arpit. Looks like the issue is like this:
1. After NN1 got suspended, NN2 started the transition. It first tried to stop 
the editlog tailer thread.
2. The editlog tailer thread happened to trigger NN1 to roll its editlog right 
before the transition, and this rpc call got stuck since NN1 was suspended.
3. It took a relatively long time (1min) for the rollEditlog rpc call to 
receive the connection reset exception.
4. During this time, NN2 waited for the tailer thread to die, and the 
fsnamesystem lock was held by the stopStandbyService call.
5. haadmin's getServiceState request could not get a response (since the lock 
was held by the transition thread in NN2) and timed out (its default socket 
timeout is 20s).

In summary, it is possible that the rollEditlog rpc call from the standby NN to 
the active NN in the editlog tailer thread may delay the NN failover.


 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao

 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931048#comment-13931048
 ] 

Jing Zhao commented on HDFS-6089:
-

Since in active NN we already have a NameNodeEditLogRoller thread triggering 
the editlog roll, I guess the standby NN doesn't need to trigger the active 
namenode to roll the editlog.

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao

 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Ted Yu (JIRA)
Ted Yu created HDFS-6092:


 Summary: DistributedFileSystem#getCanonicalServiceName() and 
DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu


I discovered this when working on HBASE-10717
Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);

String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
Canonical name string contains the default port - 8020
But uri doesn't contain port.
This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
0.001 sec   ERROR!
java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
at 
org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
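
For illustration, a caller-side workaround might look like this (a sketch only, 
not the fix being proposed here; NameNode.DEFAULT_PORT is 8020):

{code}
// Fall back to the default NN port when the URI carries none, so the address
// lines up with what getCanonicalServiceName() reports. desFs comes from the
// snippet above.
URI uri = desFs.getUri();
int port = uri.getPort();
if (port == -1) {
  port = NameNode.DEFAULT_PORT;
}
InetSocketAddress addr = new InetSocketAddress(uri.getHost(), port);
{code}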



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Description: 
I discovered this when working on HBASE-10717
Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);

String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
Canonical name string contains the default port - 8020
But uri doesn't contain port.
This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
0.001 sec   ERROR!
java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
at 
org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
Thanks to Brandon Li who helped debug this.

  was:
I discovered this when working on HBASE-10717
Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);

String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
Canonical name string contains the default port - 8020
But uri doesn't contain port.
This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
0.001 sec   ERROR!
java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
at 
org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}


 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu

 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5732) Separate memory space between BM and NN

2014-03-11 Thread Edward Bortnikov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Bortnikov updated HDFS-5732:
---

Attachment: Remote BM.pdf

Updated design of the remote BM operation. 

 Separate memory space between BM and NN
 ---

 Key: HDFS-5732
 URL: https://issues.apache.org/jira/browse/HDFS-5732
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Amir Langer
 Attachments: 
 0002-Separation-of-BM-from-NN-Step-2-Separate-memory-spac.patch, Remote BM.pdf


 Change the created APIs to not rely on the same instance being shared by both 
 the BM and the NN. Use immutable objects / keep state in sync.
 The BM and NN will still exist in the same VM; work on a new BM service as an 
 independent process is deferred to later tasks.
 Also, a one-to-one relation between the BM and the NN is assumed. 
 This task should maintain backward compatibility.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API

2014-03-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931082#comment-13931082
 ] 

Haohui Mai commented on HDFS-5978:
--

{code}
   echo "  oiv  apply the offline fsimage viewer to an fsimage"
+  echo "  wiv  run the web fsimage viewer to an fsimage"
{code}

It might be better to make this functionality a subtool of the offline 
image viewer, since it is intended to be a successor of the lsr tool.

It might be cleaner to separate the logic of loading the fsimage from the logic 
of handling webhdfs requests in {{FSImageHandler}}. You can wrap the information 
into a private static class:

{code}
+  private String[] stringTable;
+  private HashMap<Long, INode> inodes = Maps.newHashMap();
+  private HashMap<Long, long[]> dirmap = Maps.newHashMap();
+  private ArrayList<INodeReferenceSection.INodeReference> refList =
+      Lists.newArrayList();
{code}

{code}
+public ChannelPipeline getPipeline() throws Exception {
+  ChannelPipeline pipeline = Channels.pipeline();
+  pipeline.addLast("httpDecoder", new HttpRequestDecoder());
+  pipeline.addLast("requestHandler", new FSImageHandler(inputFile));
+  pipeline.addLast("stringEncoder", new StringEncoder());
+  pipeline.addLast("httpEncoder", new HttpResponseEncoder());
+  return pipeline;
+}
+  }
{code}

You might be able to create a static pipeline instead of a pipeline factory. 
See {{setPipeline()}} for more details. I'm also unclear why {{StringEncoder}} 
is required.

{code}

+  public void testWebImageViewer() throws IOException, InterruptedException {
+    final String port = "9001";

{code}

The command line needs to accept both the listening host and the port. 
Otherwise it'll listen on all interfaces by default. In the unit test, it is 
also important to configure the port to zero to avoid intermittent failures.

{code}
+  // wait until the viewer starts
+  Thread.sleep(3000);
+
{code}

You can use a condition variable here instead of sleeping.
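
A sketch of that suggestion (the start-up hook on the viewer side is assumed; 
CountDownLatch is java.util.concurrent.CountDownLatch):

{code}
// Instead of Thread.sleep(3000), block on a latch that the viewer counts down
// once it has bound its listening port.
final CountDownLatch started = new CountDownLatch(1);
// ... the viewer thread calls started.countDown() after it starts listening ...
assertTrue("viewer did not start in time",
    started.await(10, TimeUnit.SECONDS));
{code}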

{code}
+  HttpClient client = new DefaultHttpClient();
+  HttpGet httpGet =
+      new HttpGet("http://localhost:" + port + "/?op=LISTSTATUS");
+  HttpResponse response = client.execute(httpGet);
+  assertEquals(200, response.getStatusLine().getStatusCode());
+  assertEquals("application/json",
+   response.getEntity().getContentType().getValue());
{code}

Using the built-in {{HttpUrlConnection}} is sufficient. It reduces the 
dependencies of the unit tests.
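
A sketch of what that would look like in the test (illustrative only):

{code}
URL url = new URL("http://localhost:" + port + "/?op=LISTSTATUS");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
assertEquals(200, conn.getResponseCode());
// Note: getContentType() may include a charset suffix in some setups.
assertEquals("application/json", conn.getContentType());
conn.disconnect();
{code}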

{code}
+import com.google.gson.JsonArray;
+import com.google.gson.JsonObject;
+import com.google.gson.JsonParser;
{code}

Please use jackson instead of gson. hadoop-hdfs does not depend on gson at all. 
That way there is no additional dependency.


 Create a tool to take fsimage and expose read-only WebHDFS API
 --

 Key: HDFS-5978
 URL: https://issues.apache.org/jira/browse/HDFS-5978
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: tools
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: HDFS-5978.patch


 Suggested in HDFS-5975.
 Add an option that exposes a read-only version of the WebHDFS API for the 
 OfflineImageViewer. You can imagine it looking very similar to jhat.
 That way we can allow the operator to use the existing command-line tools, or 
 even the web UI, to debug the fsimage. It also allows the operator to 
 interactively browse the file system and figure out what goes wrong.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6086:
---

   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

committed, thanks!

 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Attachment: hdfs-6092-v1.txt

Tentative patch.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Status: Patch Available  (was: Open)

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5477) Block manager as a service

2014-03-11 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931095#comment-13931095
 ] 

Edward Bortnikov commented on HDFS-5477:


All great questions. Our previous documentation was pretty substandard. 
New design PDF attached at https://issues.apache.org/jira/browse/HDFS-5732 - it 
clarifies many things about the remote NN operation. 

Regarding Todd's question specifically: yes, it's impossible to guarantee 
100% atomicity of transactions in the face of failures. This atomicity is also 
not necessarily required as long as no data is lost. The distributed state must 
eventually converge. 

Our solution is to treat communication failures and process failures 
identically. If an RPC times out, we re-establish the NN-BM connection and 
re-synchronize the state. (There are many ways to optimize this process, e.g., 
Merkle trees). Since timeouts should be very rare in a datacenter network (in 
the absence of bugs), this treatment is not too harsh. 
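
A hedged sketch of that recovery policy follows (illustrative names only, not 
the actual NN/BM code): on an RPC timeout the sender drops the connection, 
reconnects, and re-synchronizes the shared state before resuming traffic.

{code}
import java.net.SocketTimeoutException;

public class ResyncOnTimeoutSketch {
  // Stand-in for the NN-to-BM channel; the real interface would carry block ops.
  interface BlockManagerChannel {
    void send(String op) throws SocketTimeoutException;
    void reconnect();
    void fullStateResync();   // could be optimized, e.g. with Merkle-tree diffs
  }

  // Treat a timeout like a process failure: reconnect and resync, then move on.
  static void sendWithRecovery(BlockManagerChannel channel, String op) {
    try {
      channel.send(op);
    } catch (SocketTimeoutException e) {
      channel.reconnect();
      channel.fullStateResync();
    }
  }

  public static void main(String[] args) {
    BlockManagerChannel fake = new BlockManagerChannel() {
      private boolean failedOnce = false;
      public void send(String op) throws SocketTimeoutException {
        if (!failedOnce) { failedOnce = true; throw new SocketTimeoutException("simulated"); }
        System.out.println("sent: " + op);
      }
      public void reconnect()       { System.out.println("reconnected"); }
      public void fullStateResync() { System.out.println("state re-synchronized"); }
    };
    sendWithRecovery(fake, "addBlock");   // times out, triggers reconnect + resync
    sendWithRecovery(fake, "addBlock");   // succeeds on the re-established channel
  }
}
{code}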

 Block manager as a service
 --

 Key: HDFS-5477
 URL: https://issues.apache.org/jira/browse/HDFS-5477
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: Proposal.pdf, Proposal.pdf, Standalone BM.pdf, 
 Standalone BM.pdf, patches.tar.gz


 The block manager needs to evolve towards having the ability to run as a 
 standalone service to improve NN vertical and horizontal scalability.  The 
 goal is reducing the memory footprint of the NN proper to support larger 
 namespaces, and improve overall performance by decoupling the block manager 
 from the namespace and its lock.  Ideally, a distinct BM will be transparent 
 to clients and DNs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6086) Fix a case where zero-copy or no-checksum reads were not allowed even when the block was cached

2014-03-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931109#comment-13931109
 ] 

Hudson commented on HDFS-6086:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5308 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5308/])
HDFS-6086. Fix a case where zero-copy or no-checksum reads were not allowed 
even when the block was cached. (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576533)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/ShortCircuitReplica.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetCache.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Fix a case where zero-copy or no-checksum reads were not allowed even when 
 the block was cached
 ---

 Key: HDFS-6086
 URL: https://issues.apache.org/jira/browse/HDFS-6086
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: 2.4.0

 Attachments: HDFS-6086.001.patch, HDFS-6086.002.patch


 We need to fix a case where zero-copy or no-checksum reads are not allowed 
 even when the block is cached.  The case is when the block is cached before 
 the {{REQUEST_SHORT_CIRCUIT_FDS}} operation begins.  In this case, 
 {{DataXceiver}} needs to consult the {{ShortCircuitRegistry}} to see if the 
 block is cached, rather than relying on a callback.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931250#comment-13931250
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634035/hdfs-6092-v1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDFSClientFailover
  org.apache.hadoop.security.TestPermissionSymlinks
  org.apache.hadoop.fs.TestGlobPaths
  org.apache.hadoop.hdfs.TestFileAppend
  org.apache.hadoop.hdfs.TestReplication
  org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs
  org.apache.hadoop.hdfs.web.TestWebHDFS
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
  org.apache.hadoop.hdfs.server.namenode.TestNameNodeAcl
  org.apache.hadoop.hdfs.TestFileStatus
  org.apache.hadoop.hdfs.TestLease
  org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.fs.viewfs.TestViewFsDefaultValue
  org.apache.hadoop.fs.TestSymlinkHdfsDisable
  org.apache.hadoop.fs.TestHDFSFileContextMainOperations
  org.apache.hadoop.cli.TestAclCLI
  org.apache.hadoop.fs.TestUrlStreamHandler
  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  org.apache.hadoop.fs.TestSymlinkHdfsFileSystem
  org.apache.hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.fs.shell.TestHdfsTextCommand
  org.apache.hadoop.fs.TestResolveHdfsSymlink
  org.apache.hadoop.hdfs.TestSnapshotCommands
  org.apache.hadoop.hdfs.server.namenode.TestFileContextAcl
  org.apache.hadoop.fs.viewfs.TestViewFsFileStatusHdfs
  org.apache.hadoop.hdfs.TestDistributedFileSystem
  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSClientRetries
  org.apache.hadoop.hdfs.server.balancer.TestBalancer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6375//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6375//console

This message is automatically generated.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 

[jira] [Created] (HDFS-6093) Expose more caching information for debugging by users

2014-03-11 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-6093:
-

 Summary: Expose more caching information for debugging by users
 Key: HDFS-6093
 URL: https://issues.apache.org/jira/browse/HDFS-6093
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang


When users submit a new cache directive, it's unclear if the NN has recognized 
it and is actively trying to cache it, or if it's hung for some other reason. 
It'd be nice to expose a pending caching/uncaching count the same way we 
expose pending replication work.

It'd also be nice to display the aggregate cache capacity and usage in dfsadmin 
 -report, since we already have it as a metric and expose it per-DN in 
report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6093) Expose more caching information for debugging by users

2014-03-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6093:
--

Status: Patch Available  (was: Open)

 Expose more caching information for debugging by users
 --

 Key: HDFS-6093
 URL: https://issues.apache.org/jira/browse/HDFS-6093
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6093-1.patch


 When users submit a new cache directive, it's unclear if the NN has 
 recognized it and is actively trying to cache it, or if it's hung for some 
 other reason. It'd be nice to expose a pending caching/uncaching count the 
 same way we expose pending replication work.
 It'd also be nice to display the aggregate cache capacity and usage in 
 dfsadmin -report, since we already have it as a metric and expose it 
 per-DN in report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users

2014-03-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931265#comment-13931265
 ] 

Arpit Agarwal commented on HDFS-6093:
-

+1 for the idea, thanks Andrew! :-)

I'll try to review this tomorrow if no one else has got to it by then.

 Expose more caching information for debugging by users
 --

 Key: HDFS-6093
 URL: https://issues.apache.org/jira/browse/HDFS-6093
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6093-1.patch


 When users submit a new cache directive, it's unclear if the NN has 
 recognized it and is actively trying to cache it, or if it's hung for some 
 other reason. It'd be nice to expose a pending caching/uncaching count the 
 same way we expose pending replication work.
 It'd also be nice to display the aggregate cache capacity and usage in 
 dfsadmin -report, since we already have it as a metric and expose it 
 per-DN in report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated HDFS-6080:
--

Attachment: HDFS-6080.patch

set rsize=1MB

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1 MB, 100 MB, and 1 GB files. We noticed a 
 significant performance decline as file size increased when compared to FUSE. 
 We realized that the issue was with the hardcoded rtmax size (64 KB). 
 When we increased rtmax to 1 MB, we got a 10x improvement in performance.
 NFS reads:
 | File          | Size       | Run 1         | Run 2         | Run 3         | Average        | Std. Dev.            |
 | testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561     |
 | testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667 | 0.030412987573361663 |
 Fuse reads:
 | File          | Size       | Run 1       | Run 2        | Run 3        | Average        | Std. Dev.             |
 | testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576     |
 | testFile1Mb   | 1048576    | 0.271615094 | 0.270835986  | 0.271796438  | 0.271415839333 | 0.0004166483951065848 |
 NFS reads after rtmax = 1MB:
 | File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
 | testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
 | testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |
 | testFile1Mb   | 1048576    | 0.115602858  | 0.106826253 | 0.125229976  | 0.1158863623334 | 0.007515962395481867 |



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.000.patch

Simple patch to remove the editlog roll from SBN.

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6089.000.patch


 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6093) Expose more caching information for debugging by users

2014-03-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6093:
--

Attachment: hdfs-6093-1.patch

Patch attached.

For the DFSAdmin output, it turns out that block pool used wasn't actually 
being sent to the client, so I added it along with the cache stuff. I verified 
the dfsadmin -report output manually. I didn't modify any existing lines, just 
added new ones (with unique strings), so existing tools shouldn't be broken.

The pending cache/uncache count is now a metric and is also shown on the web UI. 
The included test makes sure that the metric ticks up and down, and I checked 
manually on the web UI.
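
The general pattern is sketched below with illustrative names (the exact 
metric and method names in the patch may differ): pending caching/uncaching 
work is kept as counters and exposed through read-only getters, the same way 
pending replication work is exposed, so the metrics system and web UI can 
report whether the NN is actively working on a directive.

{code}
public class CacheWorkMetricsSketch {
  private volatile long pendingCacheBlocks;
  private volatile long pendingUncacheBlocks;

  // In the NameNode these would typically be @Metric-annotated getters that
  // the metrics system publishes and the web UI / JMX can display.
  public long getNumBlocksPendingCaching()   { return pendingCacheBlocks; }
  public long getNumBlocksPendingUncaching() { return pendingUncacheBlocks; }

  // CacheReplicationMonitor-style bookkeeping: count work up when it is
  // scheduled and down when DataNode cache reports confirm completion.
  void scheduleCaching(long blocks)   { pendingCacheBlocks += blocks; }
  void confirmCached(long blocks)     { pendingCacheBlocks -= blocks; }
  void scheduleUncaching(long blocks) { pendingUncacheBlocks += blocks; }
  void confirmUncached(long blocks)   { pendingUncacheBlocks -= blocks; }

  public static void main(String[] args) {
    CacheWorkMetricsSketch m = new CacheWorkMetricsSketch();
    m.scheduleCaching(12);
    m.confirmCached(7);
    System.out.println("blocks pending caching: " + m.getNumBlocksPendingCaching());
  }
}
{code}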

 Expose more caching information for debugging by users
 --

 Key: HDFS-6093
 URL: https://issues.apache.org/jira/browse/HDFS-6093
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6093-1.patch


 When users submit a new cache directive, it's unclear if the NN has 
 recognized it and is actively trying to cache it, or if it's hung for some 
 other reason. It'd be nice to expose a pending caching/uncaching count the 
 same way we expose pending replication work.
 It'd also be nice to display the aggregate cache capacity and usage in 
 dfsadmin -report, since we already have it as a metric and expose it 
 per-DN in report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6038) JournalNode hardcodes NameNodeLayoutVersion in the edit log file

2014-03-11 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931234#comment-13931234
 ] 

Todd Lipcon commented on HDFS-6038:
---

I didn't look at the code in detail, but the design approach of 
length-prefixing the edits seems reasonable. My only feedback might be that it 
would have been nicer to do that change in a JIRA labeled as such, and then 
make the JN change separately. But I'm not against doing it all here -- just 
worried that other contributors may want to review this patch as it's actually 
making an edit log format change, not just a protocol change for the JNs.

I also noticed a spot or two where you are missing a finally { 
IOUtils.closeStream(...); } -- might be worth checking for that before 
committing.
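
For reference, the idiom being asked for looks like the following (a sketch 
only; the stream construction and path are illustrative):

{code}
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.io.IOUtils;

public class CloseStreamSketch {
  static void writeRecord(byte[] record) throws IOException {
    OutputStream out = new FileOutputStream("/tmp/edits.sketch");
    try {
      out.write(record);
    } finally {
      IOUtils.closeStream(out);   // closes quietly, so the fd is never leaked
    }
  }

  public static void main(String[] args) throws IOException {
    writeRecord(new byte[] { 1, 2, 3 });
  }
}
{code}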

In terms of testing, it might be nice to add a QJM test which writes fake ops 
to a JournalNode -- ie ops with garbage data but a correct length and checksum. 
Perhaps you could do this by changing QJMTestUtil.writeOp() to write a garbage 
op?

 JournalNode hardcodes NameNodeLayoutVersion in the edit log file
 

 Key: HDFS-6038
 URL: https://issues.apache.org/jira/browse/HDFS-6038
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: journal-node, namenode
Reporter: Haohui Mai
Assignee: Jing Zhao
 Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, 
 HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, 
 HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored


 In HA setup, the JNs receive edit logs (blob) from the NN and write into edit 
 log files. In order to write well-formed edit log files, the JNs prepend a 
 header for each edit log file.
 The problem is that the JN hard-codes the version (i.e., 
 {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect 
 edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during 
 rolling upgrade.
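
 A toy sketch of the shape of the fix (illustrative only, not the JournalNode 
 code): the header's layout version should come from what the NameNode sent 
 with the edits, not from a compile-time constant.
 {code}
 import java.io.DataOutputStream;
 import java.io.FileOutputStream;
 import java.io.IOException;

 public class EditLogHeaderSketch {
   static final int HARDCODED_LAYOUT_VERSION = -56;   // illustrative constant

   // Problematic shape: ignores the version the NameNode is actually running.
   static void writeHeaderHardcoded(DataOutputStream out) throws IOException {
     out.writeInt(HARDCODED_LAYOUT_VERSION);
   }

   // Fixed shape: the version is supplied by the (NN-provided) request.
   static void writeHeader(DataOutputStream out, int nnLayoutVersion) throws IOException {
     out.writeInt(nnLayoutVersion);
   }

   public static void main(String[] args) throws IOException {
     DataOutputStream out = new DataOutputStream(new FileOutputStream("/tmp/edits.hdr"));
     try {
       writeHeader(out, -56);   // value would come from the NN during rolling upgrade
     } finally {
       out.close();
     }
   }
 }
 {code}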



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6089:


Status: Patch Available  (was: Open)

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6089.000.patch


 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931122#comment-13931122
 ] 

Hadoop QA commented on HDFS-6080:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12633847/HDFS-6080.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs-nfs:

  org.apache.hadoop.fs.TestHdfsNativeCodeLoader

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6374//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6374//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6374//console

This message is automatically generated.

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1 MB, 100 MB, and 1 GB files. We noticed a 
 significant performance decline as file size increased when compared to FUSE. 
 We realized that the issue was with the hardcoded rtmax size (64 KB). 
 When we increased rtmax to 1 MB, we got a 10x improvement in performance.
 NFS reads:
 | File          | Size       | Run 1         | Run 2         | Run 3         | Average        | Std. Dev.            |
 | testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561     |
 | testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667 | 0.030412987573361663 |
 Fuse reads:
 | File          | Size       | Run 1       | Run 2        | Run 3        | Average        | Std. Dev.             |
 | testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576     |
 | testFile1Mb   | 1048576    | 0.271615094 | 0.270835986  | 0.271796438  | 0.271415839333 | 0.0004166483951065848 |
 NFS reads after rtmax = 1MB:
 | File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
 | testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
 | testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |

[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.

2014-03-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931167#comment-13931167
 ] 

Chris Nauroth commented on HDFS-5516:
-

Hi, [~miradu-msft].  The patch looks good.  I think we can add unit tests in 
{{TestAuthFilter}} to cover the new configuration.  Could you please take a 
look?

 WebHDFS does not require user name when anonymous http requests are 
 disallowed.
 ---

 Key: HDFS-5516
 URL: https://issues.apache.org/jira/browse/HDFS-5516
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 3.0.0, 1.2.1, 2.2.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-5516.patch


 WebHDFS requests do not require user name to be specified in the request URL 
 even when in core-site configuration options HTTP authentication is set to 
 simple, and anonymous authentication is disabled.
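
 For illustration, the behaviour can be exercised with a request that omits the 
 user.name query parameter (host, port, and path below are placeholders; with 
 simple auth configured and anonymous access disabled, the expectation is that 
 such a request should be rejected):
 {code}
 import java.net.HttpURLConnection;
 import java.net.URL;

 public class WebHdfsUserNameSketch {
   public static void main(String[] args) throws Exception {
     URL noUser   = new URL("http://namenode.example.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS");
     URL withUser = new URL("http://namenode.example.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=alice");
     for (URL u : new URL[] { noUser, withUser }) {
       HttpURLConnection conn = (HttpURLConnection) u.openConnection();
       System.out.println(u + " -> HTTP " + conn.getResponseCode());
       conn.disconnect();
     }
   }
 }
 {code}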



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931141#comment-13931141
 ] 

Brandon Li commented on HDFS-6080:
--

Yes. Let's update the doc too. If the user doesn't change the mount option and 
the NFS client keeps using 32 KB or 64 KB instead, the 1 MB setting will simply 
not take effect, so it hurts nothing.
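
A hedged sketch of how the server-side limits and the client mount size fit 
together follows. The property names are assumptions for illustration; the 
committed patch and hdfs-default.xml should be checked for the exact keys.

{code}
import org.apache.hadoop.conf.Configuration;

public class NfsTransferSizeSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed key names: the gateway-side maximum read/write transfer sizes.
    conf.setInt("dfs.nfs.rtmax", 1024 * 1024);
    conf.setInt("dfs.nfs.wtmax", 1024 * 1024);
    // The NFS client must also mount with a matching transfer size, e.g.
    //   mount -o vers=3,proto=tcp,nolock,rsize=1048576,wsize=1048576 <gateway>:/ /mnt/hdfs
    // otherwise it keeps negotiating 32 KB / 64 KB and the larger limit is unused.
    System.out.println("rtmax = " + conf.getInt("dfs.nfs.rtmax", 65536));
  }
}
{code}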

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1 MB, 100 MB, and 1 GB files. We noticed a 
 significant performance decline as file size increased when compared to FUSE. 
 We realized that the issue was with the hardcoded rtmax size (64 KB). 
 When we increased rtmax to 1 MB, we got a 10x improvement in performance.
 NFS reads:
 | File          | Size       | Run 1         | Run 2         | Run 3         | Average        | Std. Dev.            |
 | testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561     |
 | testFile1Mb   | 1048576    | 0.330546906   | 0.256391808   | 0.28730168    | 0.291413464667 | 0.030412987573361663 |
 Fuse reads:
 | File          | Size       | Run 1       | Run 2        | Run 3        | Average        | Std. Dev.             |
 | testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576     |
 | testFile1Mb   | 1048576    | 0.271615094 | 0.270835986  | 0.271796438  | 0.271415839333 | 0.0004166483951065848 |
 NFS reads after rtmax = 1MB:
 | File          | Size       | Run 1        | Run 2       | Run 3        | Average         | Std. Dev.            |
 | testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 3.550467574336  | 0.0885591069882058   |
 | testFile1Gb   | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135  | 1.4389615098060426   |
 | testFile1Mb   | 1048576    | 0.115602858  | 0.106826253 | 0.125229976  | 0.1158863623334 | 0.007515962395481867 |



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931275#comment-13931275
 ] 

Hadoop QA commented on HDFS-6089:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634047/HDFS-6089.000.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA
  org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6376//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6376//console

This message is automatically generated.

 Standby NN while transitioning to active throws a connection refused error 
 when the prior active NN process is suspended
 

 Key: HDFS-6089
 URL: https://issues.apache.org/jira/browse/HDFS-6089
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Jing Zhao
 Attachments: HDFS-6089.000.patch


 The following scenario was tested:
 * Determine Active NN and suspend the process (kill -19)
 * Wait about 60s to let the standby transition to active
 * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
 active.
 What was noticed was that sometimes the call to get the service state of nn2 
 got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931307#comment-13931307
 ] 

Hadoop QA commented on HDFS-6080:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634050/HDFS-6080.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs-nfs:

  org.apache.hadoop.fs.TestHdfsNativeCodeLoader
  
org.apache.hadoop.hdfs.server.datanode.fsdataset.TestAvailableSpaceVolumeChoosingPolicy

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6377//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6377//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6377//console

This message is automatically generated.

 Improve NFS gateway performance by making rtmax and wtmax configurable
 --

 Key: HDFS-6080
 URL: https://issues.apache.org/jira/browse/HDFS-6080
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs, performance
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch


 Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the 
 maximum read and write capacity of the server. Therefore, these affect the 
 read and write performance.
 We ran performance tests with 1mb, 100mb, and 1GB files. We noticed 
 significant performance decline with the size increase when compared to fuse. 
 We realized that the issue was with the hardcoded rtmax size(64k). 
 When we increased the rtmax to 1MB, we got a 10x improvement in performance.
 NFS reads:
 +---++---+---+---++--+
 | File  | Size   | Run 1 | Run 2 | Run 3 
 | Average| Std. Dev.|
 | testFile100Mb | 104857600  | 23.131158137  | 19.24552955   | 19.793332866  
 | 20.72334018435 | 1.7172094782219731   |
 | testFile1Gb   | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 
 | 212.5355729113 | 8.14037175506561 |
 | testFile1Mb   | 1048576| 0.330546906   | 0.256391808   | 0.28730168
 | 0.291413464667 | 0.030412987573361663 |
 +---++---+---+---++--+
 Fuse reads:
 +---++-+--+--++---+
 | File  | Size   | Run 1   | Run 2| Run 3| 
 Average| Std. Dev. |
 | testFile100Mb | 104857600  | 2.394459443 | 2.695265191  | 2.50046517   | 
 2.530063267997 | 0.12457410127142007   |
 | testFile1Gb   | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 
 24.69662577297 | 0.386672412437576 |
 | testFile1Mb   | 1048576| 0.271615094 | 0.270835986  | 0.271796438  | 
 0.271415839333 | 0.0004166483951065848 |
 +---++-+--+--++---+
 (NFS read after rtmax = 1MB)
 +---++--+-+--+-+-+
 | File  | Size   | Run 1| Run 2   | Run 3| 
 Average | Std. Dev.|
 | testFile100Mb | 104857600  | 3.655261869  | 3.438676067 | 3.557464787  | 
 3.550467574336  | 0.0885591069882058   |
 | 

[jira] [Created] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-6094:
---

 Summary: The same block can be counted twice towards safe mode 
threshold
 Key: HDFS-6094
 URL: https://issues.apache.org/jira/browse/HDFS-6094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


{{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
towards the safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details in a comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-6092:
-

Attachment: hdfs-6092-v2.txt

Patch v2 passed the following tests:
{code}
  636  mvn test -Dtest=TestFileStatus
  637  mvn test -Dtest=TestWebHDFS,TestViewFileSystemHdfs,TestGlobPaths
{code}
In the patch, getCanonicalServiceName() is called only when the port of this.uri is -1.
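
For illustration only (this is not the attached patch), a caller that receives 
a port-less HDFS URI can normalize it before building a socket address, which 
avoids the "port out of range:-1" failure shown in the description:

{code}
import java.net.InetSocketAddress;
import java.net.URI;

public class PortNormalizationSketch {
  static final int DEFAULT_NN_PORT = 8020;   // assumption: NameNode IPC default

  static InetSocketAddress toAddress(URI fsUri) {
    int port = fsUri.getPort();
    if (port == -1) {
      port = DEFAULT_NN_PORT;   // fall back instead of passing -1 along
    }
    return new InetSocketAddress(fsUri.getHost(), port);
  }

  public static void main(String[] args) {
    System.out.println(toAddress(URI.create("hdfs://127.0.0.1/")));
  }
}
{code}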

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt, hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold

2014-03-11 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6094:


Description: {{BlockManager#addStoredBlock}} can cause the same block to be 
counted twice towards the safe mode threshold. We see this manifest via 
{{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
details to follow in a comment.  (was: {{BlockManager#addStoredBlock}} can 
cause the same block can be counted towards safe mode threshold. We see this 
manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on 
Ubuntu. More details in a comment.)

 The same block can be counted twice towards safe mode threshold
 ---

 Key: HDFS-6094
 URL: https://issues.apache.org/jira/browse/HDFS-6094
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal

 {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
 towards the safe mode threshold. We see this manifest via 
 {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
 details to follow in a comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-11 Thread Thanh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931338#comment-13931338
 ] 

Thanh Do commented on HDFS-6009:


Hi Yu, 

You mentioned "although the regionservers are grouped, the datanodes which 
store the data are not, which leads to the case that one datanode failure 
affects multiple applications, as we already observed in our product 
environment."

Can you elaborate on that scenario? I thought a datanode failure would be OK, 
since the data are replicated.

Best,

 Tools based on favored node feature for isolation
 -

 Key: HDFS-6009
 URL: https://issues.apache.org/jira/browse/HDFS-6009
 Project: Hadoop HDFS
  Issue Type: Task
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor

 There're scenarios like mentioned in HBASE-6721 and HBASE-4210 that in 
 multi-tenant deployments of HBase we prefer to specify several groups of 
 regionservers to serve different applications, to achieve some kind of 
 isolation or resource allocation. However, although the regionservers are 
 grouped, the datanodes which store the data are not, which leads to the case 
 that one datanode failure affects multiple applications, as we already 
 observed in our product environment.
 To relieve the above issue, we could take usage of the favored node feature 
 (HDFS-2576) to make regionserver able to locate data within its group, or say 
 make datanodes also grouped (passively), to form some level of isolation.
 In this case, or any other case that needs datanodes to be grouped, we would need 
 a bunch of tools to maintain the group, including:
 1. Making balancer able to balance data among specified servers, rather than 
 the whole set
 2. Set balance bandwidth for specified servers, rather than the whole set
 3. Some tool to check whether the block is cross-group placed, and move it 
 back if so
 This JIRA is an umbrella for the above tools.
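
 As a purely illustrative toy for the cross-group placement check mentioned in 
 the tool list above (a real tool would obtain replica locations through the 
 HDFS client APIs rather than take them as plain strings):
 {code}
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Set;

 public class CrossGroupCheckSketch {
   /** Returns the replica hosts that fall outside the datanode group. */
   static List<String> replicasOutsideGroup(List<String> replicaHosts, Set<String> groupHosts) {
     List<String> outside = new ArrayList<String>();
     for (String host : replicaHosts) {
       if (!groupHosts.contains(host)) {
         outside.add(host);
       }
     }
     return outside;
   }

   public static void main(String[] args) {
     Set<String> group = new HashSet<String>(Arrays.asList("dn1", "dn2", "dn3"));
     List<String> replicas = Arrays.asList("dn1", "dn2", "dn9");   // dn9 escaped the group
     System.out.println("cross-group replicas: " + replicasOutsideGroup(replicas, group));
   }
 }
 {code}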



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931375#comment-13931375
 ] 

Hadoop QA commented on HDFS-6093:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634074/hdfs-6093-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6378//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6378//console

This message is automatically generated.

 Expose more caching information for debugging by users
 --

 Key: HDFS-6093
 URL: https://issues.apache.org/jira/browse/HDFS-6093
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching
Affects Versions: 2.4.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-6093-1.patch


 When users submit a new cache directive, it's unclear if the NN has 
 recognized it and is actively trying to cache it, or if it's hung for some 
 other reason. It'd be nice to expose a pending caching/uncaching count the 
 same way we expose pending replication work.
 It'd also be nice to display the aggregate cache capacity and usage in 
 dfsadmin -report, since we already have it as a metric and expose it 
 per-DN in report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port

2014-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931415#comment-13931415
 ] 

Hadoop QA commented on HDFS-6092:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634088/hdfs-6092-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6379//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6379//console

This message is automatically generated.

 DistributedFileSystem#getCanonicalServiceName() and 
 DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
 --

 Key: HDFS-6092
 URL: https://issues.apache.org/jira/browse/HDFS-6092
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Ted Yu
 Attachments: hdfs-6092-v1.txt, hdfs-6092-v2.txt


 I discovered this when working on HBASE-10717
 Here is sample code to reproduce the problem:
 {code}
 Path desPath = new Path("hdfs://127.0.0.1/");
 FileSystem desFs = desPath.getFileSystem(conf);
 
 String s = desFs.getCanonicalServiceName();
 URI uri = desFs.getUri();
 {code}
 Canonical name string contains the default port - 8020
 But uri doesn't contain port.
 This would result in the following exception:
 {code}
 testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils)  Time elapsed: 
 0.001 sec   ERROR!
 java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at 
 org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
 {code}
 Thanks to Brandon Li, who helped debug this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)