[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932971#comment-13932971 ] Akira AJISAKA commented on HDFS-5978: - [~wheat9], thank you for your comment. I updated the patch to reflect the comment, and added the {{FILESTATUS}} operation. bq. I'm also unclear why {{StringEncoder}} is required. It is required because the type of the JSON content is {{String}}. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5978: Attachment: HDFS-5978.2.patch Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933009#comment-13933009 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634324/HDFS-6097.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6389//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6389//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
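To illustrate the class of bug being fixed: the 2GB constraint applies to the *length* of each MappedByteBuffer segment, not to the absolute file offset, so the guard has to clamp the mapped length rather than reject reads whose starting offset exceeds 2GB. The following is a minimal standalone sketch of the intended check; it is not the actual {{DFSInputStream#tryReadZeroCopy}} code, and the class and method names are illustrative only.
{code}
// Minimal sketch of the intended guard, not the actual DFSInputStream code.
// A MappedByteBuffer can cover at most Integer.MAX_VALUE bytes, but the
// *offset* handed to FileChannel.map() may legitimately be far beyond 2GB.
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapOffsetSketch {
  static MappedByteBuffer mapSlice(FileChannel ch, long offset, long wanted)
      throws Exception {
    // Correct check: clamp the mapped *length* to 2GB; do not reject the
    // read just because the starting offset itself exceeds 2GB.
    long length = Math.min(wanted, Integer.MAX_VALUE);
    return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
  }

  public static void main(String[] args) throws Exception {
    // Assumes args[0] names a file larger than about 3GB.
    RandomAccessFile f = new RandomAccessFile(args[0], "r");
    try {
      long offset = 3L * 1024 * 1024 * 1024;   // 3GB into the file
      MappedByteBuffer buf = mapSlice(f.getChannel(), offset, 4096);
      System.out.println("mapped " + buf.remaining() + " bytes at offset " + offset);
    } finally {
      f.close();
    }
  }
}
{code}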
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933084#comment-13933084 ] Hadoop QA commented on HDFS-5978: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634387/HDFS-5978.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6390//console This message is automatically generated. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933107#comment-13933107 ] Hudson commented on HDFS-6096: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933105#comment-13933105 ] Hudson commented on HDFS-6079: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
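As background on what "enforcing the timeout" means for a fan-out call like this, one common pattern is to submit the per-datanode queries to an executor and bound the whole batch with {{ExecutorService#invokeAll}}, which cancels anything still running when the deadline expires. This is only a generic sketch under that assumption, not necessarily the mechanism used in the committed patch; the {{queryDatanode}} helper is hypothetical.
{code}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StorageLocationTimeoutSketch {
  // Hypothetical stand-in for a per-datanode RPC.
  static Callable<String> queryDatanode(final String dn) {
    return new Callable<String>() {
      public String call() throws Exception { return dn + ": ok"; }
    };
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Callable<String>> calls =
        Arrays.asList(queryDatanode("dn1"), queryDatanode("dn2"));
    // invokeAll enforces the deadline: tasks still running when the timeout
    // expires are cancelled, and their futures report cancellation.
    List<Future<String>> results = pool.invokeAll(calls, 1500, TimeUnit.MILLISECONDS);
    for (Future<String> f : results) {
      if (!f.isCancelled()) {
        System.out.println(f.get());
      }
    }
    pool.shutdown();
  }
}
{code}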
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933110#comment-13933110 ] Hudson commented on HDFS-5705: -- FAILURE: Integrated in Hadoop-Yarn-trunk #508 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/508/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
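The failure mode itself is plain java.util behavior: iterating a {{HashMap}} while another thread mutates it fails fast. The snippet below is a self-contained illustration of that race; it is not HDFS code, the map contents are made up, and it typically (though not deterministically) ends with a {{ConcurrentModificationException}}.
{code}
// Standalone illustration of the race described above, not HDFS code.
import java.util.HashMap;
import java.util.Map;

public class CmeSketch {
  public static void main(String[] args) throws Exception {
    final Map<String, String> blockPools = new HashMap<String, String>();
    for (int i = 0; i < 100; i++) {
      blockPools.put("BP-" + i, "volume");
    }
    Thread adder = new Thread(new Runnable() {   // stands in for addBlockPool()
      public void run() {
        for (int i = 100; i < 200000; i++) {
          blockPools.put("BP-" + i, "volume");
        }
      }
    });
    adder.start();
    // Stands in for shutdown(): iterating while the other thread mutates the
    // map usually fails fast with ConcurrentModificationException.
    for (Map.Entry<String, String> e : blockPools.entrySet()) {
      Thread.sleep(1);   // widen the race window
      e.getKey();
    }
    adder.join();
  }
}
{code}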
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933243#comment-13933243 ] Hudson commented on HDFS-6079: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933248#comment-13933248 ] Hudson commented on HDFS-5705: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933245#comment-13933245 ] Hudson commented on HDFS-6096: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1700 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1700/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6079) Timeout for getFileBlockStorageLocations does not work
[ https://issues.apache.org/jira/browse/HDFS-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933344#comment-13933344 ] Hudson commented on HDFS-6079: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-6079. Timeout for getFileBlockStorageLocations does not work. Contributed by Andrew Wang. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576979) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockStorageLocationUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java Timeout for getFileBlockStorageLocations does not work -- Key: HDFS-6079 URL: https://issues.apache.org/jira/browse/HDFS-6079 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.4.0 Attachments: hdfs-6079-1.patch {{DistributedFileSystem#getFileBlockStorageLocations}} has a config value which lets clients set a timeout, but it's not being enforced correctly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6096) TestWebHdfsTokens may timeout
[ https://issues.apache.org/jira/browse/HDFS-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933346#comment-13933346 ] Hudson commented on HDFS-6096: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-6096. TestWebHdfsTokens may timeout. (Contributed by szetszwo) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576999) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java TestWebHdfsTokens may timeout - Key: HDFS-6096 URL: https://issues.apache.org/jira/browse/HDFS-6096 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 3.0.0, 2.4.0 Attachments: h6096_20140312.patch The timeout of TestWebHdfsTokens is set to 1 second. It is too short for some machines. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5705) TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933349#comment-13933349 ] Hudson commented on HDFS-5705: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1725 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1725/]) HDFS-5705. Update CHANGES.txt for merging the original fix (r1555190) to branch-2 and branch-2.4. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1576989) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestSecondaryNameNodeUpgrade#testChangeNsIDFails may fail due to ConcurrentModificationException Key: HDFS-5705 URL: https://issues.apache.org/jira/browse/HDFS-5705 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Ted Yu Fix For: 3.0.0, 2.4.0 Attachments: hdfs-5705.html, hdfs-5705.txt From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1626/testReport/org.apache.hadoop.hdfs.server.namenode/TestSecondaryNameNodeUpgrade/testChangeNsIDFails/ : {code} java.util.ConcurrentModificationException: null at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:251) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:218) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1414) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1309) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1464) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1439) at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1423) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.doIt(TestSecondaryNameNodeUpgrade.java:97) at org.apache.hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade.testChangeNsIDFails(TestSecondaryNameNodeUpgrade.java:116) {code} The above happens when shutdown() is called in parallel to addBlockPool() or shutdownBlockPool(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation
[ https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933475#comment-13933475 ] Thanh Do commented on HDFS-6009: Hi Yu Li, I want to follow up on this issue. Could you please elaborate on the datanode failure? In particular, what caused the failure in your case? Was it a disk error, a network failure, or a buggy application? If it was a disk or network failure, I think isolation using datanode groups is reasonable. Tools based on favored node feature for isolation - Key: HDFS-6009 URL: https://issues.apache.org/jira/browse/HDFS-6009 Project: Hadoop HDFS Issue Type: Task Affects Versions: 2.3.0 Reporter: Yu Li Assignee: Yu Li Priority: Minor There are scenarios, like those mentioned in HBASE-6721 and HBASE-4210, where in multi-tenant deployments of HBase we prefer to specify several groups of regionservers to serve different applications, to achieve some kind of isolation or resource allocation. However, although the regionservers are grouped, the datanodes which store the data are not, which leads to the case that one datanode failure affects multiple applications, as we have already observed in our production environment. To relieve the above issue, we could make use of the favored node feature (HDFS-2576) to let a regionserver locate data within its group, in effect grouping the datanodes (passively) to form some level of isolation. In this case, or any other case that needs datanodes to be grouped, we would need a set of tools to maintain the groups, including: 1. Making the balancer able to balance data among specified servers, rather than the whole set 2. Setting the balance bandwidth for specified servers, rather than the whole set 3. A tool to check whether a block is placed across groups, and to move it back if so This JIRA is an umbrella for the above tools. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933570#comment-13933570 ] Chris Nauroth commented on HDFS-6097: - The patch looks good, Colin. Just a few small things: # {{DFSInputStream#tryReadZeroCopy}}: It seems unnecessary to copy {{pos}} to {{curPos}}. The value of {{curPos}} is never changed throughout the method, so it's always the same as {{pos}}. This is a synchronized method, so I don't expect {{pos}} to get mutated on a different thread. # {{TestEnhancedByteBufferAccess}}: Let's remove the commented out lines and the extra indentation on the {{Assert.fail}} line. Let's use try-finally blocks to guarantee cleanup of {{cluster}}, {{fs}}, {{fsIn}} and {{fsIn2}}. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
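For reference, the cleanup structure being requested looks roughly like the following. This is only a sketch of the shape, not the committed test; {{TEST_PATH}} and the test body are placeholders.
{code}
// Sketch of the requested try-finally cleanup; not the committed test code.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Test;

public class TestZeroCopyCleanupSketch {
  private static final Path TEST_PATH = new Path("/test/file");  // placeholder

  @Test
  public void testWithGuaranteedCleanup() throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = null;
    FileSystem fs = null;
    FSDataInputStream fsIn = null;
    try {
      cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
      cluster.waitActive();
      fs = cluster.getFileSystem();
      fs.create(TEST_PATH).close();   // make sure the file exists
      fsIn = fs.open(TEST_PATH);
      // ... read through fsIn and assert ...
    } finally {
      // Runs whether the assertions pass or fail, so a failed test cannot
      // leave the cluster (and its locks on the test directory) behind.
      if (fsIn != null) { fsIn.close(); }
      if (fs != null) { fs.close(); }
      if (cluster != null) { cluster.shutdown(); }
    }
  }
}
{code}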
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5244: -- Summary: TestNNStorageRetentionManager#testPurgeMultipleDirs fails (was: TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order. ) TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
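The underlying Java behavior is easy to demonstrate in isolation: a {{HashMap}} iterates in an unspecified order while a {{LinkedHashSet}} preserves insertion order, so any test that implicitly assumes the two agree is fragile. A small standalone sketch (not the test code itself; the directory names are made up):
{code}
// Standalone demonstration of the ordering mismatch; not the test code.
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class IterationOrderSketch {
  public static void main(String[] args) {
    Map<String, Integer> dirRoots = new HashMap<String, Integer>();
    Set<String> purgeOrder = new LinkedHashSet<String>();
    for (String dir : new String[] {"/nn/name1", "/nn/name2", "/nn/name3", "/nn/name4"}) {
      dirRoots.put(dir, 1);   // iteration order is unspecified
      purgeOrder.add(dir);    // iteration order == insertion order
    }
    // The two orders are not guaranteed to agree, which is what the test
    // was implicitly assuming.
    System.out.println("HashMap order:       " + dirRoots.keySet());
    System.out.println("LinkedHashSet order: " + purgeOrder);
  }
}
{code}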
[jira] [Commented] (HDFS-6038) JournalNode hardcodes NameNodeLayoutVersion in the edit log file
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933577#comment-13933577 ] Jing Zhao commented on HDFS-6038: - Thanks for the review, Todd! bq. just worried that other contributors may want to review this patch as it's actually making an edit log format change, not just a protocol change for the JNs. I will update the jira title and description to make them more clear about the changes. bq. it might be nice to add a QJM test which writes fake ops to a JournalNode Yeah, will update the patch to add the unit test. JournalNode hardcodes NameNodeLayoutVersion in the edit log file Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-5244: -- Resolution: Fixed Target Version/s: 2.4.0 (was: 3.0.0, 2.1.1-beta) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed the patch to trunk, branch-2 and branch-2.4. Thank you [~jwang302]! TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.4.0, 2.1.0-beta Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933594#comment-13933594 ] Hudson commented on HDFS-5244: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5321 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5321/]) HDFS-5244. TestNNStorageRetentionManager#testPurgeMultipleDirs fails. Contributed by Jinghui Wang. (suresh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577254) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java TestNNStorageRetentionManager#testPurgeMultipleDirs fails - Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, and a HashMap does not have any predictable iteration order. The directories that need to be purged are stored in a LinkedHashSet, which does have a predictable order. So, when the directories get mocked for the test, they may already be out of the order in which they were added. Thus, the order in which the directories are actually purged and the order in which they were added to the LinkedHashSet can differ, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933590#comment-13933590 ] Brandon Li commented on HDFS-6080: -- +1 Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1MB, 100MB, and 1GB files. We noticed a significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size (64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 |
| testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 |
| testFile1Mb | 1048576 | 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 |
Fuse reads:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 |
| testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 |
| testFile1Mb | 1048576 | 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 |
NFS reads after rtmax = 1MB:
| File | Size (bytes) | Run 1 | Run 2 | Run 3 | Average | Std. Dev. |
| testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 |
| testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 |
| testFile1Mb | 1048576 | 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 |
-- This message was sent by Atlassian JIRA (v6.2#6252)
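The change under review replaces the hard-coded limits with values read from the configuration. The sketch below only illustrates that idea; the property names and the 1MB default shown here are assumptions for the example, not necessarily the keys introduced by the patch.
{code}
// Illustration only: read the transfer-size limits from the configuration
// instead of hard-coding them in RpcProgramNFS3. The property names and the
// 1MB default are assumptions for this example.
import org.apache.hadoop.conf.Configuration;

public class Nfs3TransferSizeSketch {
  static final String RTMAX_KEY = "dfs.nfs3.rtmax";     // hypothetical key
  static final String WTMAX_KEY = "dfs.nfs3.wtmax";     // hypothetical key
  static final int DEFAULT_TRANSFER_MAX = 1024 * 1024;  // 1MB

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int rtmax = conf.getInt(RTMAX_KEY, DEFAULT_TRANSFER_MAX);
    int wtmax = conf.getInt(WTMAX_KEY, DEFAULT_TRANSFER_MAX);
    System.out.println("rtmax=" + rtmax + ", wtmax=" + wtmax);
  }
}
{code}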
[jira] [Updated] (HDFS-6038) JournalNode hardcodes NameNodeLayoutVersion in the edit log file
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Description: In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. was: In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. JournalNode hardcodes NameNodeLayoutVersion in the edit log file Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
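The second change, persisting a length field per op so that the last txid can be found without decoding op bodies, can be illustrated with a generic length-prefixed framing. This is a sketch under an assumed record layout (4-byte length, 8-byte txid, opaque body); it is not the actual FSEditLogOp encoding or the new scanEditLog implementation.
{code}
// Sketch of skip-without-decoding over a length-prefixed record stream.
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class ScanSegmentSketch {
  // Assumed record layout: 4-byte length, 8-byte txid, then (length - 8)
  // opaque bytes. Returns the last complete txid without decoding op bodies.
  static long scanLastTxid(DataInputStream in) throws IOException {
    long lastTxid = -1;
    while (true) {
      int length;
      try {
        length = in.readInt();
      } catch (EOFException eof) {
        return lastTxid;               // clean end of segment
      }
      if (length < 8) {
        return lastTxid;               // corrupt or truncated record
      }
      long txid = in.readLong();
      if (in.skipBytes(length - 8) < length - 8) {
        return lastTxid;               // truncated trailing op; ignore it
      }
      lastTxid = txid;
    }
  }

  public static void main(String[] args) throws IOException {
    DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
    try {
      System.out.println("last txid: " + scanLastTxid(in));
    } finally {
      in.close();
    }
  }
}
{code}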
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Summary: Allow JournalNode to handle editlog produced by new release with future layoutversion (was: JournalNode hardcodes NameNodeLayoutVersion in the edit log file) Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
[ https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated HDFS-6092: --- Attachment: HDFS-6092-v4.patch DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port -- Key: HDFS-6092 URL: https://issues.apache.org/jira/browse/HDFS-6092 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Attachments: HDFS-6092-v4.patch, haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt I discovered this when working on HBASE-10717. Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);
String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
The canonical name string contains the default port (8020), but the uri doesn't contain a port. This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils) Time elapsed: 0.001 sec ERROR!
java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
Thanks to Brandon Li, who helped debug this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6092) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port
[ https://issues.apache.org/jira/browse/HDFS-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933627#comment-13933627 ] haosdent commented on HDFS-6092: [~te...@apache.org] I have uploaded HDFS-6092-v4.patch. Could you help me review it? :-) DistributedFileSystem#getCanonicalServiceName() and DistributedFileSystem#getUri() may return inconsistent results w.r.t. port -- Key: HDFS-6092 URL: https://issues.apache.org/jira/browse/HDFS-6092 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Ted Yu Attachments: HDFS-6092-v4.patch, haosdent-HDFS-6092-v2.patch, haosdent-HDFS-6092.patch, hdfs-6092-v1.txt, hdfs-6092-v2.txt, hdfs-6092-v3.txt I discovered this when working on HBASE-10717. Here is sample code to reproduce the problem:
{code}
Path desPath = new Path("hdfs://127.0.0.1/");
FileSystem desFs = desPath.getFileSystem(conf);
String s = desFs.getCanonicalServiceName();
URI uri = desFs.getUri();
{code}
The canonical name string contains the default port (8020), but the uri doesn't contain a port. This would result in the following exception:
{code}
testIsSameHdfs(org.apache.hadoop.hbase.util.TestFSHDFSUtils) Time elapsed: 0.001 sec ERROR!
java.lang.IllegalArgumentException: port out of range:-1
 at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:224)
 at org.apache.hadoop.hbase.util.FSHDFSUtils.getNNAddresses(FSHDFSUtils.java:88)
{code}
Thanks to Brandon Li, who helped debug this. -- This message was sent by Atlassian JIRA (v6.2#6252)
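Until the inconsistency is resolved, a defensive caller can fill in a default port when the URI omits one before building the socket address. A minimal sketch; the 8020 default is hard-coded here purely for illustration.
{code}
// Defensive normalization on the caller side; 8020 is used here only as an
// illustrative default NameNode RPC port.
import java.net.InetSocketAddress;
import java.net.URI;

public class NnAddressSketch {
  static InetSocketAddress toSocketAddress(URI fsUri) {
    int port = fsUri.getPort();
    if (port == -1) {           // a URI like hdfs://127.0.0.1/ carries no port
      port = 8020;
    }
    return new InetSocketAddress(fsUri.getHost(), port);
  }

  public static void main(String[] args) {
    System.out.println(toSocketAddress(URI.create("hdfs://127.0.0.1/")));
  }
}
{code}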
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933643#comment-13933643 ] Colin Patrick McCabe commented on HDFS-6097: bq. DFSInputStream#tryReadZeroCopy: It seems unnecessary to copy pos to curPos. The value of curPos is never changed throughout the method, so it's always the same as pos. This is a synchronized method, so I don't expect pos to get mutated on a different thread. This is actually an optimization I made. I wanted to avoid making a memory access each time, since I use this variable a lot. By copying it to a local variable, it becomes a lot more obvious to the optimizer that it can't change. It's possible that Java will perform this optimization automatically, but I'm skeptical because we're calling a lot of functions here. It seems like it would require a sophisticated optimizer to realize that there was no code path that changed this variable. bq. TestEnhancedByteBufferAccess: Let's remove the commented out lines and the extra indentation on the Assert.fail line. OK. bq. Let's use try-finally blocks to guarantee cleanup of cluster, fs, fsIn and fsIn2. I guess I've started to skip doing this on unit tests. My rationale is that if the test fails, cleanup isn't really that important (the surefire process will simply terminate). In the meantime, try... finally blocks complicate the code and often make it hard to see where a test originally failed. Oftentimes if things get messed up, {{FileSystem#close}} or {{MiniDFSCluster#shutdown}} will throw an exception. And you end up seeing this unhelpful exception rather than the root cause of the problem displayed in the maven test output. On the other hand, I suppose going without try... finally could encourage people to copy flawed code, so I guess that's the counter-argument. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933664#comment-13933664 ] Haohui Mai commented on HDFS-5978: -- bq. I updated the patch to reflect the comment, and added FILESTATUS operation. I would appreciate it if you could separate it into a new jira. It looks to me that the new patch has not fully addressed the previous round of comments (yet). bq. It is required because the type of the json content is String. I'm yet to be convinced. The code is dumping a UTF-8 string directly into the channel buffer. I don't quite follow why you need an extra pipeline stage to dump the string into the channel buffer. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to expose a read-only version of the WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browse the file system and figure out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
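For context on the comment about dumping the string straight into the channel buffer: under the Netty 3 API (assumed here, since that is what this code base uses at the time), encoding the JSON reply to UTF-8 and attaching it to the HTTP response needs no dedicated {{StringEncoder}} pipeline stage. A sketch only, with the handler wiring omitted:
{code}
// Sketch: encode the JSON reply to UTF-8 bytes and attach it to the HTTP
// response directly, with no StringEncoder in the pipeline. Netty 3 API
// assumed; not taken from the patch under review.
import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.buffer.ChannelBuffers;
import org.jboss.netty.handler.codec.http.DefaultHttpResponse;
import org.jboss.netty.handler.codec.http.HttpResponseStatus;
import org.jboss.netty.handler.codec.http.HttpVersion;
import org.jboss.netty.util.CharsetUtil;

public class JsonResponseSketch {
  static DefaultHttpResponse jsonResponse(String json) {
    ChannelBuffer content = ChannelBuffers.copiedBuffer(json, CharsetUtil.UTF_8);
    DefaultHttpResponse response =
        new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
    response.setHeader("Content-Type", "application/json");
    response.setContent(content);
    return response;
  }
}
{code}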
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933668#comment-13933668 ] Colin Patrick McCabe commented on HDFS-6097: Thanks for the review, Chris. I'm going to put out a new version in a sec with the test cleanups, and with try... finally in the test. I guess I'll bring up the try... finally issue on the mailing list at some point, and see what people think. In the meantime, I'd like to get this in soon so we can continue testing... zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.004.patch zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: (was: HDFS-6097.004.patch) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Open (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Patch Available (was: Open) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.004.patch zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933818#comment-13933818 ] Chris Nauroth commented on HDFS-6097: - bq. This is actually an optimization I made. I see. Thanks for explaining. Would you mind putting a comment in there? bq. I guess I've started to skip doing this on unit tests. I got into the try-finally habit during the Windows work. On Windows, we'd have one test fail and leave the cluster running, because it wasn't doing shutdown. Then, subsequent tests would also fail during initialization due to the more pessimistic file locking behavior on Windows. The prior cluster still held locks on the test data directory, so the subsequent tests couldn't reformat. The subsequent tests would have passed otherwise, so this had the effect of disrupting full test run reports with a lot of false failures. It made it more difficult to determine exactly which test was really failing. If the stack traces from close aren't helpful, then we can stifle them by calling {{IOUtils#cleanup}} and passing a null logger. FWIW, my current favorite way to do this is cluster initialization in a {{BeforeClass}} method, cluster shutdown in an {{AfterClass}} method, and sometimes close of individual streams or file systems in an {{After}} method depending on what the test is doing. This reins in the code clutter of try-finally. It's not always convenient though if you need to change {{Configuration}} in each test or if you need per-test isolation for some other reason. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
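A sketch of the class-level lifecycle pattern described above, assuming a single shared {{Configuration}} is acceptable for every test in the class; it is illustrative only, not taken from an existing test.
{code}
// Illustrative class-level lifecycle; not taken from an existing test.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.io.IOUtils;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class TestClusterLifecycleSketch {
  private static MiniDFSCluster cluster;
  private static FileSystem fs;

  @BeforeClass
  public static void setUpCluster() throws Exception {
    cluster = new MiniDFSCluster.Builder(new Configuration())
        .numDataNodes(1).build();
    cluster.waitActive();
    fs = cluster.getFileSystem();
  }

  @AfterClass
  public static void tearDownCluster() {
    // Passing a null logger stifles secondary close() exceptions so a real
    // test failure isn't drowned out in the report.
    IOUtils.cleanup(null, fs);
    if (cluster != null) {
      cluster.shutdown();
    }
  }

  @Test
  public void testSomethingAgainstTheCluster() throws Exception {
    // ... per-test assertions against fs ...
  }
}
{code}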
[jira] [Created] (HDFS-6102) Cannot load an fsimage with a very large directory
Andrew Wang created HDFS-6102: - Summary: Cannot load an fsimage with a very large directory Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
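The limit being hit is protobuf's 64MB default per message, which {{CodedInputStream#setSizeLimit}} (named in the exception text) lets the caller raise. A generic sketch of raising the limit before parsing follows; it is not the FSImage loader itself.
{code}
// Generic sketch: raise protobuf's per-message size limit before parsing a
// message that may exceed the 64MB default (e.g. a huge directory section).
import com.google.protobuf.CodedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PbSizeLimitSketch {
  static CodedInputStream newLargeMessageStream(InputStream in) {
    CodedInputStream cis = CodedInputStream.newInstance(in);
    cis.setSizeLimit(Integer.MAX_VALUE);   // effectively unlimited
    return cis;
  }

  public static void main(String[] args) throws IOException {
    FileInputStream fin = new FileInputStream(args[0]);
    try {
      CodedInputStream cis = newLargeMessageStream(fin);
      System.out.println("size limit raised; first tag = " + cis.readTag());
    } finally {
      fin.close();
    }
  }
}
{code}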
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933822#comment-13933822 ] Chris Nauroth commented on HDFS-6097: - Thanks, Colin. I also meant to add that it's a bit less relevant in this patch, because we know this test won't run on Windows (at least not yet), but like you said it does set a precedent that someone could copy-paste into future tests. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933832#comment-13933832 ] Chris Nauroth commented on HDFS-6097: - Thanks for posting v4. Were you also going to put in the comment that copying {{pos}} to {{curPos}} is an optimization? +1 after that, pending Jenkins run. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-6102: - Assignee: Andrew Wang Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933856#comment-13933856 ] Andrew Wang commented on HDFS-6102: --- Doing some back of the envelope math while looking at INodeDirectorySection in fsimage.proto, we save a packed uint64 per child. These are varints, but let's assume worst case and they use the full 10 bytes. Thus, with the 64MB default max message size, we arrive at 6.7 million entries. There are a couple approaches here: - Split the directory section up into multiple messages, such that each message is under the limit - Up the default from 64MB to the maximum supported value of 512MB, release note, and assume no one will realistically hit this - Enforce a configurable maximum on the # of entries per directory I think #3 is the best solution here, under the assumption that no one will need 6 million things in a directory. Still needs to be release noted of course. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
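For reference, the worst-case arithmetic behind the 6.7 million figure: 64 MiB at 10 bytes per packed uint64 gives 67,108,864 / 10 ≈ 6.7 million directory entries before a single message exceeds the limit.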
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Patch Available (was: Open) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Attachment: HDFS-6097.005.patch * Add a comment about the curPos optimization * add a few more comments to {{tryReadZeroCopy}} zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Status: Open (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933864#comment-13933864 ] Chris Nauroth commented on HDFS-6097: - +1 for v5 pending Jenkins. Thanks again, Colin. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933868#comment-13933868 ] Akira AJISAKA commented on HDFS-5978: - bq. I appreciate if you can separate it into a new jira. I'll separate it. bq. I don't quite follow why you need an extra pipeline stage to dump the string into the channel buffer. The pipeline stage is to encode a UTF-8 String into the channel buffer. It is required in order to dump the UTF-8 String directly into the channel buffer. Or do you mean a ChannelBuffer should be used instead of a String to create the response content? Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
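For reference, the stage being discussed would look roughly like the following sketch, assuming the Netty 3 API in use at the time; the handler name is illustrative and this is not the patch's actual code:
{code:java}
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.string.StringEncoder;
import org.jboss.netty.util.CharsetUtil;

public class ViewerPipelineSketch {
  static ChannelPipeline buildPipeline() {
    ChannelPipeline pipeline = Channels.pipeline();
    // The StringEncoder stage turns the String response (e.g. the JSON body)
    // into a ChannelBuffer as it is written out.
    pipeline.addLast("stringEncoder", new StringEncoder(CharsetUtil.UTF_8));
    return pipeline;
  }
}
{code}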
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933874#comment-13933874 ] Colin Patrick McCabe commented on HDFS-6097: bq. \[try-finally\] That's a good point, I guess. I had been assuming that the cleanup wasn't really required after a test failure, but that might not be a good assumption. In particular, we'd like to know if the subsequent tests succeeded or failed... bq. FWIW, my current favorite way to do this is cluster initialization in a BeforeClass method, cluster shutdown in an AfterClass method, and sometimes close of individual streams or file systems in an After method depending on what the test is doing. This reigns in the code clutter of try-finally. It's not always convenient though if you need to change Configuration in each test or if you need per-test isolation for some other reason. It does feel natural to use the Before method, but it also can be inflexible, like you mentioned. I think on balance I usually prefer creating a common function or class that I can have several test functions share. But it does require a try... finally and some extra boilerplate. I wish there were a way to make Before methods apply to only some test methods, or at least modify the configuration they use. bq. Thanks for posting v4. Were you also going to put in the comment that copying pos to curPos is an optimization? +1 after that, pending Jenkins run. added, thanks zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933876#comment-13933876 ] Andrew Wang commented on HDFS-6102: --- I also did a quick audit of the rest of fsimage.proto, and I think the other repeated fields are okay. INodeFile has a repeated BlockProto of up to size 30B, but we already have a default max # of blocks per file limit of 1 million so this should be okay (30MB < 64MB). Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5978: Attachment: HDFS-5978.3.patch Separated {{GETFILESTATUS}} support and fixed findbug warnings. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miodrag Radulovic updated HDFS-5516: Attachment: HDFS-5516.patch I have added two basic tests for simple auth configuration. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require a user name to be specified in the request URL even when the core-site configuration sets HTTP authentication to simple and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
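For reference, the configuration under test corresponds roughly to the following sketch (real property names; the value choices and helper method are illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class SimpleAuthConfSketch {
  // Simple HTTP authentication with anonymous requests disallowed; in this
  // mode WebHDFS requests are expected to carry user.name in the URL.
  static Configuration simpleAuthConf() {
    Configuration conf = new Configuration();
    conf.set("hadoop.http.authentication.type", "simple");
    conf.setBoolean("hadoop.http.authentication.simple.anonymous.allowed", false);
    return conf;
  }
}
{code}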
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933985#comment-13933985 ] Haohui Mai commented on HDFS-6102: -- It might be sufficient to put it into the release note. I agree with you that realistically it is quite unlikely to see someone put 6.7m inodes as the direct children of a single directory. I'm a little hesitant to introduce a new configuration just for this reason. I wonder, does the namespace quota offer a superset of this functionality? It might be more natural to enforce this in the scope of the namespace quota. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-6007: --- Attachment: HDFS-6007-4.patch Attaching an updated patch. - removed the section about ZCR, - added the description about permission on legacy SCR, - removed the table of configurations, - added configurations to hdfs-default.xml. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch updating the contents of HDFS Short-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
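For reference, the basic client-side settings that this documentation covers look roughly like the following sketch (real property names; the socket path and helper method are illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitConfSketch {
  static Configuration shortCircuitConf() {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // The domain socket path must match the path configured on the DataNode.
    conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
    return conf;
  }
}
{code}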
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934026#comment-13934026 ] Suresh Srinivas commented on HDFS-6102: --- bq. Enforce a configurable maximum on the # of entries per directory I think this is reasonable. Recently we changed the default max length of file name allowed. We should also add reasonable limit to the number of entries in a directory. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934031#comment-13934031 ] Travis Thompson commented on HDFS-6084: --- Yeah, having external links on intranet sites is usually looked down upon. Or maybe force it to open in a new tab? Either way, I don't think the page logo should link back to Apache, I click it constantly expecting to go back to the main Namenode page and remembering that's not where it goes. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Attachments: HDFS-6084.1.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934058#comment-13934058 ] Haohui Mai commented on HDFS-6084: -- Let's just remove all the links but leave the text. It seems to me that this solution leads to minimal confusion. There are three external links in the current web UI. Two of them are in {{dfshealth.html}}, and one of them is in {{explorer.html}}. [~tthompso], can you please submit a new patch that removes all external links? Thanks. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Attachments: HDFS-6084.1.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934062#comment-13934062 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634503/HDFS-6097.004.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6393//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6393//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934066#comment-13934066 ] Brandon Li commented on HDFS-6080: -- There are some format issues with the doc change. I've fixed them when committing the patch. Thank you, Abin, for the contribution! Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6080: - Attachment: HDFS-6080.patch Uploaded the committed patch. Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Fix For: 2.4.0 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6080) Improve NFS gateway performance by making rtmax and wtmax configurable
[ https://issues.apache.org/jira/browse/HDFS-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-6080: - Fix Version/s: 2.4.0 Improve NFS gateway performance by making rtmax and wtmax configurable -- Key: HDFS-6080 URL: https://issues.apache.org/jira/browse/HDFS-6080 Project: Hadoop HDFS Issue Type: Improvement Components: nfs, performance Reporter: Abin Shahab Assignee: Abin Shahab Fix For: 2.4.0 Attachments: HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch, HDFS-6080.patch Right now rtmax and wtmax are hardcoded in RpcProgramNFS3. These dictate the maximum read and write capacity of the server. Therefore, these affect the read and write performance. We ran performance tests with 1mb, 100mb, and 1GB files. We noticed significant performance decline with the size increase when compared to fuse. We realized that the issue was with the hardcoded rtmax size(64k). When we increased the rtmax to 1MB, we got a 10x improvement in performance. NFS reads: +---++---+---+---++--+ | File | Size | Run 1 | Run 2 | Run 3 | Average| Std. Dev.| | testFile100Mb | 104857600 | 23.131158137 | 19.24552955 | 19.793332866 | 20.72334018435 | 1.7172094782219731 | | testFile1Gb | 1073741824 | 219.108776636 | 201.064032255 | 217.433909843 | 212.5355729113 | 8.14037175506561 | | testFile1Mb | 1048576| 0.330546906 | 0.256391808 | 0.28730168 | 0.291413464667 | 0.030412987573361663 | +---++---+---+---++--+ Fuse reads: +---++-+--+--++---+ | File | Size | Run 1 | Run 2| Run 3| Average| Std. Dev. | | testFile100Mb | 104857600 | 2.394459443 | 2.695265191 | 2.50046517 | 2.530063267997 | 0.12457410127142007 | | testFile1Gb | 1073741824 | 25.03324924 | 24.155102554 | 24.901525525 | 24.69662577297 | 0.386672412437576 | | testFile1Mb | 1048576| 0.271615094 | 0.270835986 | 0.271796438 | 0.271415839333 | 0.0004166483951065848 | +---++-+--+--++---+ (NFS read after rtmax = 1MB) +---++--+-+--+-+-+ | File | Size | Run 1| Run 2 | Run 3| Average | Std. Dev.| | testFile100Mb | 104857600 | 3.655261869 | 3.438676067 | 3.557464787 | 3.550467574336 | 0.0885591069882058 | | testFile1Gb | 1073741824 | 34.663612417 | 37.32089122 | 37.997718857 | 36.66074083135 | 1.4389615098060426 | | testFile1Mb | 1048576| 0.115602858 | 0.106826253 | 0.125229976 | 0.1158863623334 | 0.007515962395481867 | +---++--+-+--+-+-+ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934091#comment-13934091 ] Andrew Wang commented on HDFS-6102: --- Thanks for the comments, Haohui and Suresh. I think this is actually easier than I thought, since there's already a config parameter to limit directory size (dfs.namenode.fs-limits.max-directory-items). If we just change the default to 1024*1024 or something, that might be enough. I'm currently reading through the code to make sure it works and doing manual testing, will post a (likely trivial) patch soon. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
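For reference, the proposal amounts to something like the following on the configuration side (the key already exists; the value shown is only the suggested default):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DirectoryItemsLimitSketch {
  static Configuration limitDirectoryItems() {
    Configuration conf = new Configuration();
    // Cap children per directory well below the ~6.7 million entries that
    // would overflow a 64MB protobuf message in the fsimage.
    conf.setInt("dfs.namenode.fs-limits.max-directory-items", 1024 * 1024);
    return conf;
  }
}
{code}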
[jira] [Commented] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934129#comment-13934129 ] Hadoop QA commented on HDFS-6097: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634509/HDFS-6097.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6394//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6394//console This message is automatically generated. zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Attachment: HDFS-6038.008.patch Update the patch to address Todd's comments. The main change is to add a new unit test in TestJournal. In the new test we write some editlog that JNs cannot decode, and verify that the JN can utilize the length field to scan the segment. Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
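To illustrate the scanning idea in the description, the following is a generic sketch of skipping length-prefixed records without decoding them; it is not the actual JournalNode or {{FSEditLogOp}} code, and the record format is assumed purely for illustration:
{code:java}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class LengthScanSketch {
  // Count complete length-prefixed records, jumping over each payload
  // instead of decoding it.
  static long scanRecords(DataInputStream in) throws IOException {
    long count = 0;
    while (true) {
      final int length;
      try {
        length = in.readInt();     // persisted total length of the next record
      } catch (EOFException e) {
        break;                     // clean end of the segment
      }
      if (in.skipBytes(length) < length) {
        break;                     // truncated trailing record; stop scanning
      }
      count++;
    }
    return count;
  }
}
{code}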
[jira] [Updated] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6038: Status: Patch Available (was: Open) Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6084: -- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~tthompso] for the contribution. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934201#comment-13934201 ] Hadoop QA commented on HDFS-5978: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634515/HDFS-5978.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6395//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6395//console This message is automatically generated. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Attachment: HDFS-6099.1.patch The attached patch checks max path component length and max children during renames. I reworked {{TestFsLimits}} quite a bit to do real file system operations instead of directly accessing private {{FSDirectory}} methods. That helped me write the new rename tests, and it also ends up covering more of the real {{FSDirectory}} code. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Status: Patch Available (was: Open) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6099: -- Resolution: Fixed Fix Version/s: 2.4.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~cnauroth] for the contribution. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6099: -- Comment: was deleted (was: I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~cnauroth] for the contribution.) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Status: Patch Available (was: Reopened) HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6097) zero-copy reads are incorrectly disabled on file offsets above 2GB
[ https://issues.apache.org/jira/browse/HDFS-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6097: --- Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) zero-copy reads are incorrectly disabled on file offsets above 2GB -- Key: HDFS-6097 URL: https://issues.apache.org/jira/browse/HDFS-6097 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.4.0 Attachments: HDFS-6097.003.patch, HDFS-6097.004.patch, HDFS-6097.005.patch Zero-copy reads are incorrectly disabled on file offsets above 2GB due to some code that is supposed to disable zero-copy reads on offsets in block files greater than 2GB (because MappedByteBuffer segments are limited to that size). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas reopened HDFS-6084: --- Sorry for accidentally resolving this jira. Reopening it. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-6084: -- Comment: was deleted (was: I've committed the patch to trunk, branch-2, and branch-2.4. Thanks [~tthompso] for the contribution.) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934255#comment-13934255 ] Haohui Mai commented on HDFS-6084: -- Looks good to me. +1 pending jenkins. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5978) Create a tool to take fsimage and expose read-only WebHDFS API
[ https://issues.apache.org/jira/browse/HDFS-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934254#comment-13934254 ] Akira AJISAKA commented on HDFS-5978: - The test failure is reported by HDFS-5997 and looks unrelated to the patch. Create a tool to take fsimage and expose read-only WebHDFS API -- Key: HDFS-5978 URL: https://issues.apache.org/jira/browse/HDFS-5978 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: newbie Attachments: HDFS-5978.2.patch, HDFS-5978.3.patch, HDFS-5978.patch Suggested in HDFS-5975. Add an option to exposes the read-only version of WebHDFS API for OfflineImageViewer. You can imagine it looks very similar to jhat. That way we can allow the operator to use the existing command-line tool, or even the web UI to debug the fsimage. It also allows the operator to interactively browsing the file system, figuring out what goes wrong. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6099: Attachment: HDFS-6099.2.patch I'm attaching patch v2 with one more small change. I added {{PathComponentTooLongException}} and {{MaxDirectoryItemsExceededException}} to the terse exceptions list. These are ultimately caused by bad client requests, so there isn't any value in writing the full stack trace to the NameNode logs. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)
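For context, a "terse" exception is one that gets logged as a one-line message instead of a full stack trace. The following is a minimal, self-contained sketch of that idea only; it is not the actual NameNode RPC server code, and the class and exception names are assumptions for illustration.
{code}
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for the "terse exceptions" idea: exceptions caused by
// bad client input are logged without a stack trace.
public class TerseExceptionDemo {
  private static final Set<Class<? extends Exception>> TERSE = new HashSet<>();
  static {
    TERSE.add(IllegalArgumentException.class); // e.g. a client-supplied bad path
  }

  static void logException(Exception e) {
    if (TERSE.contains(e.getClass())) {
      // Terse: the message alone is enough; a stack trace adds no value.
      System.err.println(e.getClass().getSimpleName() + ": " + e.getMessage());
    } else {
      // Unexpected server-side error: keep the full stack trace for debugging.
      e.printStackTrace();
    }
  }

  public static void main(String[] args) {
    logException(new IllegalArgumentException("path component too long"));
    logException(new RuntimeException("unexpected internal error"));
  }
}
{code}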
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Attachment: hdfs-6102-1.patch Patch attached. It's dead simple, just ups the default in DFSConfigKeys and hdfs-default.xml, and adds some notes. I also took the opportunity to set the max component limit in DFSConfigKeys, since I noticed that HDFS-6055 didn't do that. I manually tested by adding a million dirs to a dir, and we hit the limit. NN was able to startup again afterwards, and the fsimage itself was only 78MB (most of that probably going to the INode names). I think this is best case, not worst case, since IIRC the inode numbers start low and count up, but if someone wants to verify my envelope math I think it's good to go. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
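For readers who want to sanity-check the envelope math above, a rough calculation against the default 64 MB protobuf message limit might look like the following. The ~10 bytes per child is an assumption for illustration, not the exact DirEntry wire format.
{code}
// Back-of-the-envelope estimate of how many children a single serialized
// directory entry can hold before hitting the default protobuf size limit.
// The per-child byte cost is an assumption, not the exact encoding.
public class DirEntrySizeEstimate {
  public static void main(String[] args) {
    long protobufLimitBytes = 64L * 1024 * 1024; // default CodedInputStream limit
    long bytesPerChild = 10;                     // assumed worst-case varint inode id
    long maxChildren = protobufLimitBytes / bytesPerChild;
    System.out.println("Approx. children per directory before hitting the limit: "
        + maxChildren); // on the order of 6-7 million
  }
}
{code}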
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Status: Patch Available (was: Open) Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934262#comment-13934262 ] Haohui Mai commented on HDFS-6102: -- It looks mostly good to me. The only comment I have is that the code should no longer support unlimited number of children in a directory, which is in {{FSDirectory#verifyMaxDirItems()}}. {code} if (maxDirItems == 0) { return; } {code} Otherwise users might run into a problem that the saved fsimage cannot be consumed. Do you think it is a good idea to enforce a maximum limit, say, 6.7m based on your calculation? Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
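A minimal sketch of the kind of bound check being suggested here follows; the constant values and message wording are assumptions for illustration, and the committed patch may differ.
{code}
import com.google.common.base.Preconditions;

// Sketch: validate the configured directory-items limit at startup so that a
// value large enough to produce an unloadable fsimage is rejected up front.
// MIN/MAX values below are illustrative assumptions, not the actual constants.
public class DirItemsLimitCheck {
  static final int MIN_DIR_ITEMS = 1;
  static final int MAX_DIR_ITEMS = 6400000; // assumed hard cap below the PB limit

  static int validateMaxDirItems(int configured) {
    Preconditions.checkArgument(
        configured >= MIN_DIR_ITEMS && configured <= MAX_DIR_ITEMS,
        "Cannot set dfs.namenode.fs-limits.max-directory-items to %s: "
            + "it must be between %s and %s", configured, MIN_DIR_ITEMS, MAX_DIR_ITEMS);
    return configured;
  }
}
{code}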
[jira] [Updated] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5516: Target Version/s: 3.0.0, 1-win, 1.3.0, 2.4.0 (was: 3.0.0, 1-win, 1.3.0) Hadoop Flags: Reviewed Status: Patch Available (was: In Progress) +1 for the patch. Thanks for adding the tests. I'm clicking the Submit Patch button to give it a Jenkins test run. I see target versions were set to 1.x too. Do you want to attach a patch for branch-1? WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.2.0, 1.2.1, 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6102: -- Attachment: hdfs-6102-2.patch Good idea Haohui, new patch adds some precondition checks and removes that if statement. Also a new test for the preconditions. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934307#comment-13934307 ] Haohui Mai commented on HDFS-6102: -- +1 pending jenkins Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
[ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934314#comment-13934314 ] Hadoop QA commented on HDFS-6007: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634538/HDFS-6007-4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6396//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6396//console This message is automatically generated. Update documentation about short-circuit local reads Key: HDFS-6007 URL: https://issues.apache.org/jira/browse/HDFS-6007 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Priority: Minor Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch, HDFS-6007-4.patch updating the contents of HDFS SHort-Circuit Local Reads based on the changes in HDFS-4538 and HDFS-4953. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934316#comment-13934316 ] Konstantin Shvachko commented on HDFS-6087: --- Not sure I fully understood what you propose. So please feel free to correct if I am wrong. # Sounds like you propose to update blockID every time the pipeline fails and that will guarantee block immutability. Isn't that similar to how current HDFS uses generationStamp? When pipeline fails HDFS increments genStamp making previously created replicas outdated. # Seems you propose to introduce an extra commitBlock() call to NN. Current HDFS has similar logic. Block commit is incorporated with addBlock() and complete() calls. E.g. addBlock() changes state to committed of the previous block of the file and then allocates the new one. # Don't see how you get rid of lease recovery. The purpose of which is to reconcile different replicas of the incomplete last block, as they can have different lengths or genStamps on different DNs, as the results of the client or DNs failure in the middle of a data transfer. If you propose to discard uncommitted blocks entirely, then it will break current semantics, which states that if a byte was read by a client once it should be readable by other clients as well. # I guess it boils down to that your diagrams show regular work-flow, but don't consider failure scenarios. Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is HDFS client’s responsibility to re-added with new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Attachment: HDFS-6100.000.patch DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Summary: DataNodeWebHdfsMethods does not failover in HA mode (was: webhdfs filesystem does not failover in HA mode) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Status: Patch Available (was: Open) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6100: - Description: In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. (was: While running slive with a webhdfs file system reducers fail as they keep trying to write to standby namenode.) DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934349#comment-13934349 ] Hadoop QA commented on HDFS-6084: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634558/HDFS-6084.2.patch.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestCheckpoint {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6397//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6397//console This message is automatically generated. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6100) DataNodeWebHdfsMethods does not failover in HA mode
[ https://issues.apache.org/jira/browse/HDFS-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934360#comment-13934360 ] Haohui Mai commented on HDFS-6100: -- The v0 patch overloads the meaning of the URL parameter {{namenoderpcaddress}}. It is the host-port pair of the NN in non-HA mode, but it becomes the nameservice id in HA mode. DataNodeWebHdfsMethods does not failover in HA mode --- Key: HDFS-6100 URL: https://issues.apache.org/jira/browse/HDFS-6100 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-6100.000.patch In {{DataNodeWebHdfsMethods}}, the code creates a {{DFSClient}} to connect to the NN, so that it can access the files in the cluster. {{DataNodeWebHdfsMethods}} relies on the address passed in the URL to locate the NN. Currently the parameter is set by the NN and it is a host-ip pair, which does not support HA. -- This message was sent by Atlassian JIRA (v6.2#6252)
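To illustrate the distinction between a logical nameservice and a host-port pair from the client side, here is a small sketch of ordinary HDFS client code (not the DataNodeWebHdfsMethods fix itself); the nameservice name "mycluster" and the hostname are assumptions for illustration.
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch: a logical nameservice URI is resolved through the configured failover
// proxy provider and follows the active NN; a fixed host:port pins the client to
// one NameNode even if it is currently the standby.
public class NameserviceVsHostPort {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem haFs = FileSystem.get(URI.create("hdfs://mycluster/"), conf);
    FileSystem fixedFs = FileSystem.get(URI.create("hdfs://nn1.example.com:8020/"), conf);
    System.out.println(haFs.getUri() + " vs " + fixedFs.getUri());
  }
}
{code}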
[jira] [Commented] (HDFS-6038) Allow JournalNode to handle editlog produced by new release with future layoutversion
[ https://issues.apache.org/jira/browse/HDFS-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934370#comment-13934370 ] Hadoop QA commented on HDFS-6038: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634559/HDFS-6038.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer The test build failed in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6398//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6398//console This message is automatically generated. Allow JournalNode to handle editlog produced by new release with future layoutversion - Key: HDFS-6038 URL: https://issues.apache.org/jira/browse/HDFS-6038 Project: Hadoop HDFS Issue Type: Sub-task Components: journal-node, namenode Reporter: Haohui Mai Assignee: Jing Zhao Attachments: HDFS-6038.000.patch, HDFS-6038.001.patch, HDFS-6038.002.patch, HDFS-6038.003.patch, HDFS-6038.004.patch, HDFS-6038.005.patch, HDFS-6038.006.patch, HDFS-6038.007.patch, HDFS-6038.008.patch, editsStored In HA setup, the JNs receive edit logs (blob) from the NN and write into edit log files. In order to write well-formed edit log files, the JNs prepend a header for each edit log file. The problem is that the JN hard-codes the version (i.e., {{NameNodeLayoutVersion}} in the edit log, therefore it generates incorrect edit logs when the newer release bumps the {{NameNodeLayoutVersion}} during rolling upgrade. In the meanwhile, currently JN tries to decode the in-progress editlog segment in order to know the last txid in the segment. In the rolling upgrade scenario, the JN with the old software may not be able to correctly decode the editlog generated by the new software. This jira makes the following changes to allow JN to handle editlog produced by software with future layoutversion: 1. Change the NN--JN startLogSegment RPC signature and let NN specify the layoutversion for the new editlog segment. 2. Persist a length field for each editlog op to indicate the total length of the op. Instead of calling EditLogFileInputStream#validateEditLog to get the last txid of an in-progress editlog segment, a new method scanEditLog is added and used by JN which does not decode each editlog op but uses the length to quickly jump to the next op. -- This message was sent by Atlassian JIRA (v6.2#6252)
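The "scan by length instead of decoding" idea in the description can be illustrated with a simplified sketch. The record layout below (a 4-byte length prefix per op) is an assumption for illustration; the real editlog op encoding is more involved.
{code}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch: walk a length-prefixed segment without decoding op bodies, stopping
// cleanly at a truncated trailing op. Layout is illustrative, not the HDFS format.
public class LengthPrefixedScanner {
  public static long countOps(String path) throws IOException {
    long ops = 0;
    try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
      while (true) {
        int len;
        try {
          len = in.readInt();      // length of the next op
        } catch (EOFException eof) {
          break;                   // clean end of the segment
        }
        byte[] body = new byte[len];
        try {
          in.readFully(body);      // jump past the op body without decoding it
        } catch (EOFException truncated) {
          break;                   // partially written trailing op; stop scanning
        }
        ops++;
      }
    }
    return ops;
  }
}
{code}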
[jira] [Commented] (HDFS-5516) WebHDFS does not require user name when anonymous http requests are disallowed.
[ https://issues.apache.org/jira/browse/HDFS-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934376#comment-13934376 ] Hadoop QA commented on HDFS-5516: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634527/HDFS-5516.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6401//console This message is automatically generated. WebHDFS does not require user name when anonymous http requests are disallowed. --- Key: HDFS-5516 URL: https://issues.apache.org/jira/browse/HDFS-5516 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0, 1.2.1, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-5516.patch, HDFS-5516.patch WebHDFS requests do not require user name to be specified in the request URL even when in core-site configuration options HTTP authentication is set to simple, and anonymous authentication is disabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934384#comment-13934384 ] Jing Zhao commented on HDFS-6094: - I can also reproduce the issue on my local machine. Looks like the issue is: 1. After the standby NN restarts, DN1 sends first the incremental block report then the complete block report to SBN. 2. DN2 sends the incremental block report to SBN. This block report will not change the replica number in SBN because the corresponding storage ID has not been added in SBN yet (the storage ID will only be added during the full block report processing). However, the SBN still checks the current live replica number (which is 1 because SBN already received the full block report from DN1) and uses that number to update the safe block count. So maybe a simple fix can be:
{code}
@@ -2277,7 +2277,7 @@ private Block addStoredBlock(final BlockInfo block,
     if (storedBlock.getBlockUCState() == BlockUCState.COMMITTED &&
         numLiveReplicas >= minReplication) {
       storedBlock = completeBlock(bc, storedBlock, false);
-    } else if (storedBlock.isComplete()) {
+    } else if (storedBlock.isComplete() && added) {
       // check whether safe replication is reached for the block
       // only complete blocks are counted towards that
       // Is no-op if not in safe mode.
{code}
The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-6084. -- Resolution: Fixed Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934388#comment-13934388 ] Haohui Mai commented on HDFS-6084: -- I've committed the patch to trunk, branch-2 and branch-2.4. Thanks [~tthompso] for the contribution. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6084) Namenode UI - Hadoop logo link shouldn't go to hadoop homepage
[ https://issues.apache.org/jira/browse/HDFS-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934398#comment-13934398 ] Hudson commented on HDFS-6084: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5325 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5325/]) HDFS-6084. Namenode UI - Hadoop logo link shouldn't go to hadoop homepage. Contributed by Travis Thompson. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1577401) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.html Namenode UI - Hadoop logo link shouldn't go to hadoop homepage Key: HDFS-6084 URL: https://issues.apache.org/jira/browse/HDFS-6084 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.3.0 Reporter: Travis Thompson Assignee: Travis Thompson Priority: Minor Fix For: 2.4.0 Attachments: HDFS-6084.1.patch.txt, HDFS-6084.2.patch.txt When clicking the Hadoop title the user is taken to the Hadoop homepage, which feels unintuitive. There's already a link at the bottom where it's always been, which is reasonable. I think that the title should go to the main Namenode page, #tab-overview. Suggestions? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934407#comment-13934407 ] Jing Zhao commented on HDFS-6094: - Another option is to add new storage id even for incremental block report. [~arpitagarwal], what do you think? The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6094: Attachment: TestHASafeMode-output.txt Attach the log of the test that reproduced the failure. I injected an exception for each increment of safe block count. The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6103) FSImage file system image version check throws a (slightly) wrong parameter.
jun aoki created HDFS-6103: -- Summary: FSImage file system image version check throws a (slightly) wrong parameter. Key: HDFS-6103 URL: https://issues.apache.org/jira/browse/HDFS-6103 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: jun aoki Priority: Trivial Trivial error message issue: When upgrading HDFS, say from 2.0.5 to 2.2.0, users will need to start the namenode with the upgrade option, e.g.
{code}
sudo service namenode upgrade
{code}
However, the actual error emitted when starting without the option says -upgrade (with a hyphen):
{code}
2014-03-13 23:38:15,488 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: File system image contains an old layout version -40. An upgrade to version -47 is required. Please restart NameNode with -upgrade option.
 at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:221)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:787)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:568)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:684)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:669)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
2014-03-13 23:38:15,492 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-13 23:38:15,493 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down NameNode at nn1/192.168.2.202 / ~
{code}
I'm referring to 2.0.5 above, https://github.com/apache/hadoop-common/blob/branch-2.0.5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L225 I haven't tried trunk, but it seems to return UPGRADE (all upper case), which is again a slightly wrong error description. https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L232 -- This message was sent by Atlassian JIRA (v6.2#6252)
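As a sketch of the kind of fix being suggested, keeping the user-facing flag string in a single place prevents the error message and the accepted command-line option from drifting apart. The enum below is a simplified stand-in for illustration, not the actual HdfsServerConstants startup-option code.
{code}
// Sketch: one enum owns the exact flag string, so the error message and the
// accepted command-line option can never disagree. Simplified illustration only.
public class StartupOptionDemo {
  enum StartupOption {
    UPGRADE("-upgrade"),
    ROLLBACK("-rollback");

    private final String flag;
    StartupOption(String flag) { this.flag = flag; }
    String getFlag() { return flag; }
  }

  public static void main(String[] args) {
    System.err.println("File system image contains an old layout version. "
        + "Please restart NameNode with the " + StartupOption.UPGRADE.getFlag()
        + " option.");
  }
}
{code}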
[jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934472#comment-13934472 ] Jing Zhao commented on HDFS-6094: - Maybe another issue with the current code is that when an incremental block report comes before the full block report, if the stored block state is COMMITTED, we may increase the safemode total block number while not increase the safe block count. In that case I'm not sure if the NN can get stuck in the safemode. The same block can be counted twice towards safe mode threshold --- Key: HDFS-6094 URL: https://issues.apache.org/jira/browse/HDFS-6094 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: TestHASafeMode-output.txt {{BlockManager#addStoredBlock}} can cause the same block can be counted towards safe mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More details to follow in a comment. Exception details: {code} Time elapsed: 12.874 sec FAILURE! java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.' at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493) at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934485#comment-13934485 ] Guo Ruijing commented on HDFS-6087: --- I plan to remove the snapshot part, add one work-flow for write/append/truncate, and add more work-flows for exception handling in the design proposal. The basic idea:
1) A block is immutable. If the block is committed to the NN, we can copy the block instead of appending to it, and then commit the copy to the NN.
2) Before a block is committed to the NN, it is the client's responsibility to re-add it if the write fails, and other clients cannot read that block, so we don't need a generationStamp to recover the block.
3) After a block is committed to the NN, the file length is updated in the NN, so clients cannot see uncommitted blocks.
4) write/append/truncate share the same logic.
1. Update the BlockID on any failure before commit, including pipeline failure. The design proposal tries to remove the generationStamp.
2. An extra copyBlock(oldBlockID, newBlockID, length) call is used for append and truncate.
3. commitBlock: a) the block becomes immutable; b) remove all blocks after the offset to implement truncate/append; c) update the file length.
4. If a block is not committed to the namenode, the file length is not updated and clients cannot read the block.
5. I will add more failure scenarios.
Unify HDFS write/append/truncate Key: HDFS-6087 URL: https://issues.apache.org/jira/browse/HDFS-6087 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Guo Ruijing Attachments: HDFS Design Proposal.pdf In existing implementation, HDFS file can be appended and HDFS block can be reopened for append. This design will introduce complexity including lease recovery. If we design HDFS block as immutable, it will be very simple for append truncate. The idea is that HDFS block is immutable if the block is committed to namenode. If the block is not committed to namenode, it is HDFS client’s responsibility to re-added with new block ID. -- This message was sent by Atlassian JIRA (v6.2#6252)
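A rough sketch of the work-flow described above follows, with hypothetical method names standing in for the calls the proposal mentions; addBlock/copyBlock/commitBlock here are not existing HDFS client APIs.
{code}
// Sketch of the proposed immutable-block append: copy the committed bytes into a
// freshly allocated block, write the new data, then commit. Names are hypothetical.
interface ProposedNameNodeCalls {
  long addBlock(String path);                                      // allocate a new block id
  void copyBlock(long oldBlockId, long newBlockId, long length);   // copy first `length` bytes
  void commitBlock(String path, long blockId, long newFileLength); // make the block immutable
}

class ImmutableBlockAppend {
  private final ProposedNameNodeCalls nn;

  ImmutableBlockAppend(ProposedNameNodeCalls nn) { this.nn = nn; }

  void append(String path, long lastBlockId, long lastBlockLength, byte[] data) {
    long newBlockId = nn.addBlock(path);                     // 1. fresh block id, no genStamp
    nn.copyBlock(lastBlockId, newBlockId, lastBlockLength);  // 2. copy the committed bytes
    // 3. write `data` to the new block through the pipeline (omitted), then commit.
    nn.commitBlock(path, newBlockId, lastBlockLength + data.length);
    // Until commitBlock succeeds the file length is unchanged, so readers only see
    // the old committed block; a failed client simply retries with another block id.
  }
}
{code}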
[jira] [Commented] (HDFS-6102) Cannot load an fsimage with a very large directory
[ https://issues.apache.org/jira/browse/HDFS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934492#comment-13934492 ] Hadoop QA commented on HDFS-6102: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634587/hdfs-6102-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6399//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6399//console This message is automatically generated. Cannot load an fsimage with a very large directory -- Key: HDFS-6102 URL: https://issues.apache.org/jira/browse/HDFS-6102 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Blocker Attachments: hdfs-6102-1.patch, hdfs-6102-2.patch Found by [~schu] during testing. We were creating a bunch of directories in a single directory to blow up the fsimage size, and it ends up we hit this error when trying to load a very large fsimage: {noformat} 2014-03-13 13:57:03,901 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 24523605 INodes. 2014-03-13 13:57:59,038 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Failed to load image from FSImageFile(file=/dfs/nn/current/fsimage_00024532742, cpktTxId=00024532742) com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9839) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry.init(FsImageProto.java:9770) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9901) at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeDirectorySection$DirEntry$1.parsePartialFrom(FsImageProto.java:9896) at 52) ... {noformat} Some further research reveals there's a 64MB max size per PB message, which seems to be what we're hitting here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6099) HDFS file system limits not enforced on renames.
[ https://issues.apache.org/jira/browse/HDFS-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934497#comment-13934497 ] Hadoop QA commented on HDFS-6099: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634573/HDFS-6099.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6400//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6400//console This message is automatically generated. HDFS file system limits not enforced on renames. Key: HDFS-6099 URL: https://issues.apache.org/jira/browse/HDFS-6099 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.3.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-6099.1.patch, HDFS-6099.2.patch {{dfs.namenode.fs-limits.max-component-length}} and {{dfs.namenode.fs-limits.max-directory-items}} are not enforced on the destination path during rename operations. This means that it's still possible to create files that violate these limits. -- This message was sent by Atlassian JIRA (v6.2#6252)