[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267543#comment-14267543
 ] 

Hudson commented on HDFS-7564:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #800 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/800/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java


 NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
 

 Key: HDFS-7564
 URL: https://issues.apache.org/jira/browse/HDFS-7564
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Assignee: Yongjun Zhang
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
 HDFS-7564.003.patch


 Add dynamic reloading of the NFS gateway UID/GID mapping file /etc/nfs.map 
 (the default for static.id.mapping.file).
 The mapping file currently appears to be read only when the NFS gateway is 
 restarted, so picking up new mappings requires a restart, which would cause 
 any active clients' NFS mount points to hang or fail.
 Regards,
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon
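Such a dynamic reload can be implemented by checking the mapping file's modification time before each lookup and re-parsing only when it has changed. Below is a minimal sketch of that pattern; the class and method names are hypothetical (not Hadoop's actual ShellBasedIdMapping API), and the "uid <remoteId> <localId>" line format is assumed from the static mapping file convention:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MappingReloader {
    private final Path mappingFile;
    private long lastModified = -1;
    private Map<Integer, Integer> uidMap = new HashMap<>();

    public MappingReloader(Path mappingFile) {
        this.mappingFile = mappingFile;
    }

    // Called before each ID lookup: re-reads the file only when its
    // modification time has changed, so active mounts never need a restart.
    public synchronized Map<Integer, Integer> currentUidMap() throws IOException {
        long mtime = Files.getLastModifiedTime(mappingFile).toMillis();
        if (mtime != lastModified) {
            uidMap = parseUidLines(Files.readAllLines(mappingFile));
            lastModified = mtime;
        }
        return uidMap;
    }

    // Parses lines of the assumed form "uid <remoteId> <localId>";
    // comments and "gid" lines would be handled analogously.
    static Map<Integer, Integer> parseUidLines(List<String> lines) {
        Map<Integer, Integer> map = new HashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length == 3 && f[0].equals("uid")) {
                map.put(Integer.parseInt(f[1]), Integer.parseInt(f[2]));
            }
        }
        return map;
    }
}
```

The actual fix committed for this issue may differ; this only illustrates the mtime-check approach to reloading without restarting the gateway.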



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267661#comment-14267661
 ] 

Hudson commented on HDFS-7564:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1998 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1998/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java




[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267537#comment-14267537
 ] 

Hudson commented on HDFS-7564:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #66 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/66/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Created] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)
JichengSong created HDFS-7592:
-

 Summary: A bug in BlocksMap that causes a NameNode memory leak.
 Key: HDFS-7592
 URL: https://issues.apache.org/jira/browse/HDFS-7592
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.21.0
 Environment: HDFS-0.21.0
Reporter: JichengSong
Assignee: JichengSong


In our HDFS production environment, the NameNode runs into frequent full GC 
after about two months of uptime, and we have to restart it manually.
We dumped the NameNode's heap for object statistics.
Before restarting NameNode:
num #instances #bytes class name
--
    1: 59262275 3613989480 [Ljava.lang.Object;
    ...
    10: 8549361 615553992 
org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
    11: 5941511 427788792 
org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
After restarting NameNode:
num #instances #bytes class name
--
     1: 44188391 2934099616 [Ljava.lang.Object;
  ...
    23: 721763 51966936 
org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
    24: 620028 44642016 
org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
We find that the number of BlockInfoUnderConstruction instances is abnormally 
large before restarting the NameNode.
BlockInfoUnderConstruction holds a block's state while its file is being 
written, but the write load on our cluster is far below a million ops/sec, so 
we believe the NameNode is leaking memory.
We fixed the bug with the following patch.

--- src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java  (revision 
1640066)
+++ src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java  (working copy)
@@ -205,6 +205,8 @@
   DatanodeDescriptor dn = currentBlock.getDatanode(idx);
   dn.replaceBlock(currentBlock, newBlock);
 }
+// change to fix bug about memory leak of NameNode
+map.remove(newBlock);
 // replace block in the map itself
 map.put(newBlock, newBlock);
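The extra map.remove(newBlock) matters because of how java.util.HashMap behaves when an equal key is already present: put() replaces only the value, not the stored key object, so the stale BlockInfoUnderConstruction can live on as the map's key. A self-contained sketch of the pitfall (the Block class here is illustrative, not Hadoop's BlockInfo):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyRetentionDemo {
    // Two Block objects with the same id are equal, like two BlockInfo
    // objects for the same block in different states.
    static final class Block {
        final long id;
        final String state;
        Block(long id, String state) { this.id = id; this.state = state; }
        @Override public boolean equals(Object o) {
            return o instanceof Block && ((Block) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    // Returns the state of the key object the map retains after replacement.
    static String retainedKeyState(boolean removeFirst) {
        Map<Block, Block> map = new HashMap<>();
        Block old = new Block(1, "UNDER_CONSTRUCTION");
        map.put(old, old);
        Block complete = new Block(1, "COMPLETE");
        if (removeFirst) {
            map.remove(complete);    // drops the old key object first
        }
        map.put(complete, complete); // without the remove, only the VALUE is replaced
        return map.keySet().iterator().next().state;
    }

    public static void main(String[] args) {
        System.out.println(retainedKeyState(false)); // old object leaks as the key
        System.out.println(retainedKeyState(true));  // remove-then-put drops it
    }
}
```

Removing the equal key before the put ensures the new BlockInfo object is referenced as both key and value, letting the old under-construction object be garbage collected.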





[jira] [Updated] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java

2015-01-07 Thread Ayappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayappan updated HDFS-4681:
--
Affects Version/s: (was: 2.5.1)
   (was: 2.4.1)
   (was: 2.5.0)
   2.5.2

 TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails 
 using IBM java
 -

 Key: HDFS-4681
 URL: https://issues.apache.org/jira/browse/HDFS-4681
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.5.2
Reporter: Tian Hong Wang
Assignee: Ayappan
  Labels: patch
 Attachments: HDFS-4681-v1.patch, HDFS-4681.patch


 TestBlocksWithNotEnoughRacks unit test fails with the following error message:
 
 testCorruptBlockRereplicatedAcrossRacks(org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks)
   Time elapsed: 8997 sec   FAILURE!
 org.junit.ComparisonFailure: Corrupt replica 
 expected:...[binary replica contents elided] but 
 was:...[binary replica contents elided]
 at org.junit.Assert.assertEquals(Assert.java:123)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:229)





[jira] [Updated] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java

2015-01-07 Thread Ayappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayappan updated HDFS-4681:
--
Target Version/s:   (was: 2.1.0-beta, 2.6.0)



[jira] [Updated] (HDFS-2219) Fsck should work with fully qualified file paths.

2015-01-07 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-2219:
--
Component/s: tools
   Assignee: Tsz Wo Nicholas Sze

 Fsck should work with fully qualified file paths.
 -

 Key: HDFS-2219
 URL: https://issues.apache.org/jira/browse/HDFS-2219
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 0.23.0
Reporter: Jitendra Nath Pandey
Assignee: Tsz Wo Nicholas Sze
Priority: Minor

 Fsck takes absolute paths, but it doesn't work with fully qualified file path 
 URIs. In a federated cluster with multiple namenodes, it would be useful to be 
 able to specify a file on any namenode by its fully qualified path. 
 Currently, a non-default file system can only be specified with the -fs option.
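Supporting fully qualified paths mostly amounts to splitting the URI into a filesystem part (which selects the namenode) and a path part. A rough sketch of that split using java.net.URI; the host name is purely illustrative, and this is not the actual fsck implementation:

```java
import java.net.URI;

public class QualifiedPath {
    // Split a fully qualified HDFS path into the filesystem URI and the
    // absolute path, which is what fsck needs to target the right namenode.
    static String[] split(String qualified) {
        URI u = URI.create(qualified);
        return new String[] {
            u.getScheme() + "://" + u.getAuthority(), // selects the namenode
            u.getPath()                               // path within that filesystem
        };
    }

    public static void main(String[] args) {
        String[] parts = split("hdfs://nn1.example.com:8020/user/alice/data");
        System.out.println(parts[0]); // hdfs://nn1.example.com:8020
        System.out.println(parts[1]); // /user/alice/data
    }
}
```

With such a split, `fsck hdfs://nn1.example.com:8020/user/alice/data` could behave like `fsck /user/alice/data -fs hdfs://nn1.example.com:8020` does today.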





[jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JichengSong updated HDFS-7592:
--
Attachment: blocksmap-2015-01-08.patch

We fixed the bug with the attached patch.



[jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JichengSong updated HDFS-7592:
--
Description: 
In our HDFS production environment, the NameNode runs into frequent full GC 
after about two months of uptime, and we have to restart it manually.
We dumped the NameNode's heap for object statistics.
Before restarting NameNode:
num #instances #bytes class name
--
    1: 59262275 3613989480 [Ljava.lang.Object;
    ...
    10: 8549361 615553992 
org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
    11: 5941511 427788792 
org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
After restarting NameNode:
num #instances #bytes class name
--
     1: 44188391 2934099616 [Ljava.lang.Object;
  ...
    23: 721763 51966936 
org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
    24: 620028 44642016 
org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
We find that the number of BlockInfoUnderConstruction instances is abnormally 
large before restarting the NameNode.
BlockInfoUnderConstruction holds a block's state while its file is being 
written, but the write load on our cluster is far below a million ops/sec, so 
we believe the NameNode is leaking memory.
We fixed the bug with the following patch.

--- src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java  
(revision 1640066)
+++ src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java
@@ -205,6 +205,8 @@
   DatanodeDescriptor dn = currentBlock.getDatanode(idx);
   dn.replaceBlock(currentBlock, newBlock);
 }
+// change to fix bug about memory leak of NameNode
+map.remove(newBlock);
 // replace block in the map itself
 map.put(newBlock, newBlock);




[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Target Version/s: 3.0.0, trunk-win
  Status: Patch Available  (was: In Progress)

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.002.patch, HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code, so 
 any application that wants to use libnative_mini_dfs also has to link against 
 libhdfs.





[jira] [Commented] (HDFS-3296) Running libhdfs tests in mac fails

2015-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268792#comment-14268792
 ] 

Hadoop QA commented on HDFS-3296:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12690681/HDFS-3296.001.patch
  against trunk revision ef237bd.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.balancer.TestBalancer

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9151//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9151//console

This message is automatically generated.

 Running libhdfs tests in mac fails
 --

 Key: HDFS-3296
 URL: https://issues.apache.org/jira/browse/HDFS-3296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Reporter: Amareshwari Sriramadasu
Assignee: Chris Nauroth
 Attachments: HDFS-3296.001.patch


 Running ant -Dcompile.c++=true -Dlibhdfs=true test-c++-libhdfs on Mac fails 
 with the following error:
 {noformat}
  [exec] dyld: lazy symbol binding failed: Symbol not found: 
 _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] dyld: Symbol not found: _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] 
 /Users/amareshwari.sr/workspace/hadoop/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 122: 39485 Trace/BPT trap: 5   CLASSPATH=$HADOOP_CONF_DIR:$CLASSPATH 
 LD_PRELOAD=$LIB_JVM_DIR/libjvm.so:$LIBHDFS_INSTALL_DIR/libhdfs.so: 
 $LIBHDFS_BUILD_DIR/$HDFS_TEST
 {noformat}





[jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JichengSong updated HDFS-7592:
--
Labels: BlocksMap MemoryLeak  (was: MemoryLeak)



[jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JichengSong updated HDFS-7592:
--
Labels: MemoryLeak  (was: )



[jira] [Updated] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java

2015-01-07 Thread Ayappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayappan updated HDFS-4681:
--
Assignee: Suresh Srinivas  (was: Ayappan)



[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Target Version/s: 3.0.0  (was: 3.0.0, trunk-win)



[jira] [Commented] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268768#comment-14268768
 ] 

Zhanwei Wang commented on HDFS-7589:


Hi [~crisnack],

Would you please help me verify that the target version is right? This is my 
first time submitting a patch.



[jira] [Commented] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java

2015-01-07 Thread Ayappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268889#comment-14268889
 ] 

Ayappan commented on HDFS-4681:
---

Any update on this?

 TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails 
 using IBM java
 -

 Key: HDFS-4681
 URL: https://issues.apache.org/jira/browse/HDFS-4681
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.5.0, 2.4.1, 2.5.1
Reporter: Tian Hong Wang
Assignee: Ayappan
  Labels: patch
 Fix For: 2.0.3-alpha, 2.4.1

 Attachments: HDFS-4681-v1.patch, HDFS-4681.patch


 TestBlocksWithNotEnoughRacks unit test fails with the following error message:
 
 testCorruptBlockRereplicatedAcrossRacks(org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks)
   Time elapsed: 8997 sec   FAILURE!
 org.junit.ComparisonFailure: Corrupt replica 
 expected:...[binary replica contents elided] but 
 was:...[binary replica contents elided]
 at org.junit.Assert.assertEquals(Assert.java:123)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:229)





[jira] [Commented] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268879#comment-14268879
 ] 

Hadoop QA commented on HDFS-7589:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12690701/HDFS-7589.002.patch
  against trunk revision ef237bd.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9152//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9152//console

This message is automatically generated.

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.002.patch, HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
 Other applications which want to use libnative_mini_dfs have to link to 
 libhdfs.





[jira] [Commented] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268880#comment-14268880
 ] 

Zhanwei Wang commented on HDFS-7589:


Hi [~crisnack] and [~cnauroth]

Sorry, I made a mistake with your names.

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.002.patch, HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
 Other applications which want to use libnative_mini_dfs have to link to 
 libhdfs.





[jira] [Updated] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java

2015-01-07 Thread Ayappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayappan updated HDFS-4681:
--
Fix Version/s: (was: 2.4.1)
   (was: 2.0.3-alpha)

 TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails 
 using IBM java
 -

 Key: HDFS-4681
 URL: https://issues.apache.org/jira/browse/HDFS-4681
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.5.0, 2.4.1, 2.5.1
Reporter: Tian Hong Wang
Assignee: Ayappan
  Labels: patch
 Attachments: HDFS-4681-v1.patch, HDFS-4681.patch


 TestBlocksWithNotEnoughRacks unit test fails with the following error message:
 
 testCorruptBlockRereplicatedAcrossRacks(org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks)
   Time elapsed: 8997 sec   FAILURE!
 org.junit.ComparisonFailure: Corrupt replica 
 expected:...��^EI�u�[�{���[$�\hF�[�R{O�L^S��g�#�O��׼��Wv��6u4Hd)FaŔ��^W�0��H/�^ZU^@�6�02���:)$�{|�^@�-���|GvW��7g
  �/M��[U!eF�^N^?�4pR�d��|��Ŵ7j^O^Sh�^@�nu�(�^C^Y�;I�Q�K^Oc���   
 oKtE�*�^\3u��]Ē:mŭ^^y�^H��_^T�^ZS4�7�C�^G�_���\|^W�vo���zgU�lmJ)_vq~�+^Mo^G^O�W}�.�4
 ��6b�S�G�^?��m4FW#^@
 D5��}�^Z�^]���mfR^G#T-�N��̋�p���`�~��`�^F;�^C] but 
 was:...��^EI�u�[�{���[$�\hF�[R{O�L^S��g�#�O��׼��Wv��6u4Hd)FaŔ��^W�0��H/�^ZU^@�6�02�:)$�{|�^@�-���|GvW��7g
  �/M�[U!eF�^N^?�4pR�d��|��Ŵ7j^O^Sh�^@�nu�(�^C^Y�;I�Q�K^Oc���  
 oKtE�*�^\3u��]Ē:mŭ^^y���^H��_^T�^ZS���4�7�C�^G�_���\|^W�vo���zgU�lmJ)_vq~�+^Mo^G^O�W}�.�4
��6b�S�G�^?��m4FW#^@
 D5��}�^Z�^]���mfR^G#T-�N�̋�p���`�~��`�^F;�]
 at org.junit.Assert.assertEquals(Assert.java:123)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:229)





[jira] [Updated] (HDFS-7592) A bug in BlocksMap that cause NameNode memory leak.

2015-01-07 Thread JichengSong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JichengSong updated HDFS-7592:
--
Labels: BlocksMap leak memory  (was: BlocksMap MemoryLeak)

 A bug in BlocksMap that  cause NameNode  memory leak.
 -

 Key: HDFS-7592
 URL: https://issues.apache.org/jira/browse/HDFS-7592
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.21.0
 Environment: HDFS-0.21.0
Reporter: JichengSong
Assignee: JichengSong
  Labels: BlocksMap, leak, memory

 In our HDFS production environment, NameNode FGC frequently after running for 
 2 months, we have to restart NameNode manually.
 We dumped NameNode's Heap for objects statistics.
 Before restarting NameNode:
 num #instances #bytes class name
 --
     1: 59262275 3613989480 [Ljava.lang.Object;
     ...
     10: 8549361 615553992 
 org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     11: 5941511 427788792 
 org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
 After restarting NameNode:
 num #instances #bytes class name
 --
      1: 44188391 2934099616 [Ljava.lang.Object;
   ...
     23: 721763 51966936 
 org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
     24: 620028 44642016 
 org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
 We find the number of BlockInfoUnderConstruction is abnormally large before 
 restarting NameNode.
 As we know, BlockInfoUnderConstruction keeps block state while the file is 
 being written. But the write pressure on
 our cluster is far below a million writes/sec. We think there is a memory leak 
 in NameNode.
 We fixed the bug with the following patch.
 --- src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java  (revision 
 1640066)
 +++ src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java  (working copy)
 @@ -205,6 +205,8 @@
DatanodeDescriptor dn = currentBlock.getDatanode(idx);
dn.replaceBlock(currentBlock, newBlock);
  }
 +// change to fix bug about memory leak of NameNode
 +map.remove(newBlock);
  // replace block in the map itself
  map.put(newBlock, newBlock);
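The mechanism behind the leak can be reproduced with plain {{java.util.HashMap}} semantics: {{put()}} with an equal key replaces only the value and keeps the original key object, so without the preceding {{remove()}} the stale instance stays reachable from the key set. A minimal, self-contained sketch (the {{Block}} class below is hypothetical, not Hadoop's BlockInfoUnderConstruction):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyRetentionDemo {
    // Hypothetical block key: equality and hashing use the id only,
    // mirroring how a blocks map keys entries by block ID.
    static final class Block {
        final long id;
        final String state;  // payload that differs between instances
        Block(long id, String state) { this.id = id; this.state = state; }
        @Override public boolean equals(Object o) {
            return o instanceof Block && ((Block) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    /** Returns true when the OLD key instance is still in the map afterwards. */
    static boolean staleKeyRetained(boolean removeFirst) {
        Map<Block, Block> map = new HashMap<>();
        Block oldBlock = new Block(42, "underConstruction");
        map.put(oldBlock, oldBlock);

        Block newBlock = new Block(42, "complete");   // equals(oldBlock) is true
        if (removeFirst) {
            map.remove(newBlock);                     // the proposed fix
        }
        map.put(newBlock, newBlock);                  // replaces the VALUE, not the key
        return map.keySet().iterator().next() == oldBlock;
    }

    public static void main(String[] args) {
        System.out.println("stale key retained without remove(): "
            + staleKeyRetained(false));               // true
        System.out.println("stale key retained with remove():    "
            + staleKeyRetained(true));                // false
    }
}
```

Without the {{remove()}}, the old key object (and everything it references) stays pinned in the map even though the lookup now returns the new value, which matches the growth of BlockInfoUnderConstruction instances observed above.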





[jira] [Commented] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268439#comment-14268439
 ] 

Chris Nauroth commented on HDFS-7589:
-

Hello [~wangzw].  This looks mostly good.  Thank you for the patch.

There is a compilation error on Windows in test_libhdfs_zerocopy.c in 
{{nmdConfigureHdfsBuilder}}.  We have to stick to the C89 rules for declaring 
local variables at the top of the function, but {{domainSocket}} is declared 
midway through the function.  In this case, you can fix it by moving the line 
declaring and initializing {{domainSocket}} to the top like this:

{code}
int ret;
tPort port;
const char *domainSocket = hdfsGetDomainSocketPath(cl);
{code}

If you want to remove the dependency on hdfs.h, I don't think it would be a bad 
thing to duplicate the definition of {{EINTERNAL}} here.

This patch is not dependent on anything in the HDFS-6994 feature branch, so I 
think we can target this straight to trunk and branch-2.  That would reduce the 
size of the patch for reviewers to check later when it comes time for a merge 
of the feature branch back to trunk.  Let me know your thoughts.  If you agree, 
please click the Submit Patch button to trigger a Jenkins run.

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
 Other applications which want to use libnative_mini_dfs have to link to 
 libhdfs.





[jira] [Created] (HDFS-7590) Stabilize and document getBlockLocation API in WebHDFS

2015-01-07 Thread Jakob Homan (JIRA)
Jakob Homan created HDFS-7590:
-

 Summary: Stabilize and document getBlockLocation API in WebHDFS
 Key: HDFS-7590
 URL: https://issues.apache.org/jira/browse/HDFS-7590
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Jakob Homan


Currently the GET_BLOCK_LOCATIONS op is marked as private, unstable and is not 
documented in the WebHDFS web page.  The getBlockLocations is a public, stable 
API on FileSystem.  WebHDFS' GBL response is private-unstable because the API 
currently directly serializes out the LocatedBlocks instance and LocatedBlocks 
is private-unstable.  

A public-stable version of the response should be agreed upon and documented.





[jira] [Updated] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.

2015-01-07 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-7561:
--
   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thank you [~xieliang007].

 TestFetchImage should write fetched-image-dir under target.
 ---

 Key: HDFS-7561
 URL: https://issues.apache.org/jira/browse/HDFS-7561
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Konstantin Shvachko
Assignee: Liang Xie
 Fix For: 2.7.0

 Attachments: HDFS-7561-001.txt, HDFS-7561-002.txt


 {{TestFetchImage}} creates directory {{fetched-image-dir}} under hadoop-hdfs, 
 which is then never cleaned up. The problem is that it uses build.test.dir 
 property, which seems to be invalid. Probably should use 
 {{MiniDFSCluster.getBaseDirectory()}}.





[jira] [Commented] (HDFS-7561) TestFetchImage should write fetched-image-dir under target.

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268557#comment-14268557
 ] 

Hudson commented on HDFS-7561:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6824 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6824/])
HDFS-7561. TestFetchImage should write fetched-image-dir under target. 
Contributed by Liang Xie. (shv: rev e86943fd6429be96913db4b61363faa66e95508c)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFetchImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 TestFetchImage should write fetched-image-dir under target.
 ---

 Key: HDFS-7561
 URL: https://issues.apache.org/jira/browse/HDFS-7561
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Konstantin Shvachko
Assignee: Liang Xie
 Fix For: 2.7.0

 Attachments: HDFS-7561-001.txt, HDFS-7561-002.txt


 {{TestFetchImage}} creates directory {{fetched-image-dir}} under hadoop-hdfs, 
 which is then never cleaned up. The problem is that it uses build.test.dir 
 property, which seems to be invalid. Probably should use 
 {{MiniDFSCluster.getBaseDirectory()}}.





[jira] [Resolved] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-1213.
-
Resolution: Not a Problem

There is now an HDFS provider implemented in the Apache Commons VFS tree:

http://svn.apache.org/viewvc/commons/proper/vfs/trunk/core/src/main/java/org/apache/commons/vfs2/provider/hdfs/

I believe that means this jira is no longer needed, so I'm going to resolve it. 
 (Please feel free to reopen if I misunderstood.)

 Implement an Apache Commons VFS Driver for HDFS
 ---

 Key: HDFS-1213
 URL: https://issues.apache.org/jira/browse/HDFS-1213
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Reporter: Michael D'Amour
 Attachments: HADOOP-HDFS-Apache-VFS.patch, 
 pentaho-hdfs-vfs-TRUNK-SNAPSHOT-sources.tar.gz, 
 pentaho-hdfs-vfs-TRUNK-SNAPSHOT.jar


 We have an open source ETL tool (Kettle) which uses VFS for many input/output 
 steps/jobs.  We would like to be able to read/write HDFS from Kettle using 
 VFS.  
  
 I haven't been able to find anything out there beyond comments that it would be nice.
  
 I had some time a few weeks ago to begin writing a VFS driver for HDFS and we 
 (Pentaho) would like to be able to contribute this driver.  I believe it 
 supports all the major file/folder operations and I have written unit tests 
 for all of these operations.  The code is currently checked into an open 
 Pentaho SVN repository under the Apache 2.0 license.  There are some current 
 limitations, such as a lack of authentication (Kerberos), which appears to be 
 coming in 0.22.0; the driver supports username/password, but I just 
 can't use them yet.
 I will be attaching the code for the driver once the case is created.  The 
 project does not modify existing hadoop/hdfs source.
 Our JIRA case can be found at http://jira.pentaho.com/browse/PDI-4146





[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-07 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268598#comment-14268598
 ] 

Benoy Antony commented on HDFS-7467:


{quote}
If a file does not satisfies the specified policy, fsck should show such 
information.
{quote}

That's good for a file. What do we do for directories which can potentially 
have children with different policies ? 

 Provide storage tier information for a directory via fsck
 -

 Key: HDFS-7467
 URL: https://issues.apache.org/jira/browse/HDFS-7467
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: balancer & mover
Affects Versions: 2.6.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7467.patch


 Currently _fsck_  provides information regarding blocks for a directory.
 It should be augmented to provide storage tier information (optionally). 
 The sample report could be as follows :
 {code}
 Storage Tier Combination    # of blocks    % of blocks
 DISK:1,ARCHIVE:2                 340730       97.7393%
 ARCHIVE:3                          3928        1.1268%
 DISK:2,ARCHIVE:2                   3122        0.8956%
 DISK:2,ARCHIVE:1                    748        0.2146%
 DISK:1,ARCHIVE:3                     44        0.0126%
 DISK:3,ARCHIVE:2                     30        0.0086%
 DISK:3,ARCHIVE:1                      9        0.0026%
 {code}
  





[jira] [Updated] (HDFS-3296) Running libhdfs tests in mac fails

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-3296:

 Component/s: libhdfs
Target Version/s: 2.7.0
Assignee: Chris Nauroth

 Running libhdfs tests in mac fails
 --

 Key: HDFS-3296
 URL: https://issues.apache.org/jira/browse/HDFS-3296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Reporter: Amareshwari Sriramadasu
Assignee: Chris Nauroth
 Attachments: HDFS-3296.001.patch


 Running ant -Dcompile.c++=true -Dlibhdfs=true test-c++-libhdfs on Mac fails 
 with following error:
 {noformat}
  [exec] dyld: lazy symbol binding failed: Symbol not found: 
 _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] dyld: Symbol not found: _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] 
 /Users/amareshwari.sr/workspace/hadoop/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 122: 39485 Trace/BPT trap: 5   CLASSPATH=$HADOOP_CONF_DIR:$CLASSPATH 
 LD_PRELOAD=$LIB_JVM_DIR/libjvm.so:$LIBHDFS_INSTALL_DIR/libhdfs.so: 
 $LIBHDFS_BUILD_DIR/$HDFS_TEST
 {noformat}





[jira] [Updated] (HDFS-3296) Running libhdfs tests in mac fails

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-3296:

Attachment: HDFS-3296.001.patch

This is almost working in current trunk, as long as we set 
{{DYLD_LIBRARY_PATH}}.  (Patch attached.)  However, this then runs into a 
problem with test_libhdfs_zerocopy hanging.  There appears to be some 
difference in domain socket handling on Mac (or BSDs in general) that our code 
isn't handling well.  I'm still investigating.

 Running libhdfs tests in mac fails
 --

 Key: HDFS-3296
 URL: https://issues.apache.org/jira/browse/HDFS-3296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Reporter: Amareshwari Sriramadasu
 Attachments: HDFS-3296.001.patch


 Running ant -Dcompile.c++=true -Dlibhdfs=true test-c++-libhdfs on Mac fails 
 with following error:
 {noformat}
  [exec] dyld: lazy symbol binding failed: Symbol not found: 
 _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] dyld: Symbol not found: _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] 
 /Users/amareshwari.sr/workspace/hadoop/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 122: 39485 Trace/BPT trap: 5   CLASSPATH=$HADOOP_CONF_DIR:$CLASSPATH 
 LD_PRELOAD=$LIB_JVM_DIR/libjvm.so:$LIBHDFS_INSTALL_DIR/libhdfs.so: 
 $LIBHDFS_BUILD_DIR/$HDFS_TEST
 {noformat}





[jira] [Updated] (HDFS-3296) Running libhdfs tests in mac fails

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-3296:

Status: Patch Available  (was: Open)

 Running libhdfs tests in mac fails
 --

 Key: HDFS-3296
 URL: https://issues.apache.org/jira/browse/HDFS-3296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Reporter: Amareshwari Sriramadasu
 Attachments: HDFS-3296.001.patch


 Running ant -Dcompile.c++=true -Dlibhdfs=true test-c++-libhdfs on Mac fails 
 with following error:
 {noformat}
  [exec] dyld: lazy symbol binding failed: Symbol not found: 
 _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] dyld: Symbol not found: _JNI_GetCreatedJavaVMs
  [exec]   Referenced from: 
 /Users/amareshwari.sr/workspace/hadoop/build/c++/Mac_OS_X-x86_64-64/lib/libhdfs.0.dylib
  [exec]   Expected in: flat namespace
  [exec] 
  [exec] 
 /Users/amareshwari.sr/workspace/hadoop/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 122: 39485 Trace/BPT trap: 5   CLASSPATH=$HADOOP_CONF_DIR:$CLASSPATH 
 LD_PRELOAD=$LIB_JVM_DIR/libjvm.so:$LIBHDFS_INSTALL_DIR/libhdfs.so: 
 $LIBHDFS_BUILD_DIR/$HDFS_TEST
 {noformat}





[jira] [Created] (HDFS-7591) hdfs classpath command should support same options as hadoop classpath.

2015-01-07 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-7591:
---

 Summary: hdfs classpath command should support same options as 
hadoop classpath.
 Key: HDFS-7591
 URL: https://issues.apache.org/jira/browse/HDFS-7591
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: scripts
Reporter: Chris Nauroth








[jira] [Updated] (HDFS-7591) hdfs classpath command should support same options as hadoop classpath.

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-7591:

 Description: HADOOP-10903 enhanced the {{hadoop classpath}} command to 
support optional expansion of the wildcards and bundling the classpath into a 
jar file containing a manifest with the Class-Path attribute.  The other 
classpath commands should do the same for consistency.
Target Version/s: 3.0.0

 hdfs classpath command should support same options as hadoop classpath.
 ---

 Key: HDFS-7591
 URL: https://issues.apache.org/jira/browse/HDFS-7591
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: scripts
Reporter: Chris Nauroth

 HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional 
 expansion of the wildcards and bundling the classpath into a jar file 
 containing a manifest with the Class-Path attribute.  The other classpath 
 commands should do the same for consistency.





[jira] [Updated] (HDFS-7591) hdfs classpath command should support same options as hadoop classpath.

2015-01-07 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-7591:

Priority: Minor  (was: Major)

 hdfs classpath command should support same options as hadoop classpath.
 ---

 Key: HDFS-7591
 URL: https://issues.apache.org/jira/browse/HDFS-7591
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: scripts
Reporter: Chris Nauroth
Priority: Minor

 HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional 
 expansion of the wildcards and bundling the classpath into a jar file 
 containing a manifest with the Class-Path attribute.  The other classpath 
 commands should do the same for consistency.





[jira] [Commented] (HDFS-7591) hdfs classpath command should support same options as hadoop classpath.

2015-01-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268629#comment-14268629
 ] 

Chris Nauroth commented on HDFS-7591:
-

Thanks to [~aw] for reporting it to me.  This one is targeted to 3.0.0 only, 
because there is no {{hdfs classpath}} command in 2.x.

 hdfs classpath command should support same options as hadoop classpath.
 ---

 Key: HDFS-7591
 URL: https://issues.apache.org/jira/browse/HDFS-7591
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: scripts
Reporter: Chris Nauroth

 HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional 
 expansion of the wildcards and bundling the classpath into a jar file 
 containing a manifest with the Class-Path attribute.  The other classpath 
 commands should do the same for consistency.





[jira] [Commented] (HDFS-7553) fix the TestDFSUpgradeWithHA due to BindException

2015-01-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268218#comment-14268218
 ] 

Chris Nauroth commented on HDFS-7553:
-

Hi [~xieliang007].  If I understand correctly, you were trying to reset the 
port to 0, which would then retrigger the logic for dynamically selecting a new 
bind port.  However, I don't think this is right, because all of this is 
happening in a {{Before}} method before each test runs.  The failure report 
shows the bind failure part-way through one test after a NameNode restart, so 
it doesn't look like an interaction spanning multiple tests that run into a 
port conflict with each other.

Another possibility here might be that NameNode shutdown waits for completion 
of the RPC server shutdown, but not the HTTP server shutdown.  This happens in 
{{NameNode#join}}:

{code}
  public void join() {
try {
  rpcServer.join();
} catch (InterruptedException ie) {
      LOG.info("Caught interrupted exception ", ie);
}
  }
{code}

However, there is no call to {{HttpServer2#join}}.  This would ultimately block 
waiting for Jetty's thread pool to shut down.  Perhaps we have a race condition 
where Jetty remains running bound to that port for a brief window after 
NameNode shutdown.  Then, when the restart happens, we have a conflict on that 
port.
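The suspected race can be sketched with plain threads standing in for the RPC and HTTP servers (an illustration of the shutdown-ordering idea only, not Hadoop's actual code):

```java
public class JoinBothDemo {
    /** Starts a thread that "serves" for the given duration, standing in for a server. */
    static Thread serve(String name, long workMillis) {
        Thread t = new Thread(() -> {
            try { Thread.sleep(workMillis); } catch (InterruptedException ignored) {}
        }, name);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread rpcServer  = serve("rpc",  50);
        Thread httpServer = serve("http", 200);

        // Joining only the RPC thread mirrors NameNode#join: afterwards the
        // HTTP thread (and, in the real case, its bound port) may still be alive.
        rpcServer.join();
        System.out.println("http still alive after rpc join: " + httpServer.isAlive());

        // Joining the HTTP thread as well closes the window before a restart
        // tries to rebind the same port.
        httpServer.join();
        System.out.println("http alive after http join: " + httpServer.isAlive());
    }
}
```

If the analysis is right, the analogous fix would be to also wait on the HTTP server's shutdown in NameNode#join before allowing a restart to proceed.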

 fix the TestDFSUpgradeWithHA due to BindException
 -

 Key: HDFS-7553
 URL: https://issues.apache.org/jira/browse/HDFS-7553
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 2.7.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-7553-001.txt


 see 
 https://builds.apache.org/job/PreCommit-HDFS-Build/9092//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestDFSUpgradeWithHA/testNfsUpgrade/
  :
 Error Message
 Port in use: localhost:57896
 Stacktrace
 java.net.BindException: Port in use: localhost:57896
   at sun.nio.ch.Net.bind0(Native Method)
   at sun.nio.ch.Net.bind(Net.java:444)
   at sun.nio.ch.Net.bind(Net.java:436)
   at 
 sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
   at 
 org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
   at 
 org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:868)
   at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:809)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:704)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:591)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:763)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:747)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1443)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1815)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1796)
   at 
 org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA.testNfsUpgrade(TestDFSUpgradeWithHA.java:285)





[jira] [Commented] (HDFS-5631) Expose interfaces required by FsDatasetSpi implementations

2015-01-07 Thread Joe Pallas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268147#comment-14268147
 ] 

Joe Pallas commented on HDFS-5631:
--

Yes, the ExternalDatasetImpl would need to be updated for changes to the SPI, 
just as SimulatedFSDataset needs to be updated.

Why, then, is ExternalDatasetImpl even necessary?  Because the goal of 
ExternalDatasetImpl is to guarantee that all the classes necessary to implement 
the interface are (and remain) publicly accessible, and SimulatedFSDataset 
doesn't do that (because it resides in the same package as the interface, 
org.apache.hadoop.hdfs.server.datanode).

Also, the test classes make sure that constructors for the necessary classes 
are visible, which SimulatedFSDataset would not do even if it were moved into a 
different package.

There's some burden placed on people who want to make changes to the SPI, but 
that's probably a good thing.  Changes to the SPI should not be made casually, 
because they may have impact outside of HDFS.
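The "successful compilation is effectively a pass" idea can be shown with a toy SPI (all names below are invented, not the real FsDatasetSpi): the dummy implementation only compiles if every type and constructor it touches is publicly accessible, and a trivial runtime check gives the build something tangible to track.

```java
public class AccessibilityDemo {
    // Stand-in SPI: in the real test this would be FsDatasetSpi plus the
    // public helper types its implementations need.
    public interface StorageSpi {
        long capacity();
    }

    // Dummy implementation playing the role of code outside the SPI's
    // package. That this class COMPILES is the real assertion: it proves
    // the interface surface is publicly reachable. Its behavior is
    // deliberately trivial.
    public static final class ExternalStorageImpl implements StorageSpi {
        @Override public long capacity() { return 0L; }
    }

    /** Trivial, tangible check: the dummy can be constructed and invoked. */
    static boolean accessible() {
        StorageSpi spi = new ExternalStorageImpl();
        return spi.capacity() == 0L;
    }

    public static void main(String[] args) {
        System.out.println("accessibility test passes: " + accessible());
    }
}
```

In the real patch the dummy subclass would live in a different package from the interface, which is exactly what SimulatedFSDataset cannot verify from inside org.apache.hadoop.hdfs.server.datanode.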


 Expose interfaces required by FsDatasetSpi implementations
 --

 Key: HDFS-5631
 URL: https://issues.apache.org/jira/browse/HDFS-5631
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 3.0.0
Reporter: David Powell
Assignee: David Powell
Priority: Minor
 Attachments: HDFS-5631.patch, HDFS-5631.patch


 This sub-task addresses section 4.1 of the document attached to HDFS-5194,
 the exposure of interfaces needed by a FsDatasetSpi implementation.
 Specifically it makes ChunkChecksum public and BlockMetadataHeader's
 readHeader() and writeHeader() methods public.
 The changes to BlockReaderUtil (and related classes) discussed by section
 4.1 are only needed if supporting short-circuit, and should be addressed
 as part of an effort to provide such support rather than this JIRA.
 To help ensure these changes are complete and are not regressed in the
 future, tests that gauge the accessibility (though *not* behavior)
 of interfaces needed by a FsDatasetSpi subclass are also included.
 These take the form of a dummy FsDatasetSpi subclass -- a successful
 compilation is effectively a pass.  Trivial unit tests are included so
 that there is something tangible to track.





[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267709#comment-14267709
 ] 

Hudson commented on HDFS-7564:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #67 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/67/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java


 NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
 

 Key: HDFS-7564
 URL: https://issues.apache.org/jira/browse/HDFS-7564
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Assignee: Yongjun Zhang
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
 HDFS-7564.003.patch


 Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
 (default for static.id.mapping.file).
 It seems that the mappings file is currently only read upon restart of the 
 NFS gateway, which would cause any active clients' NFS mount points to hang or 
 fail.
 Regards,
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon
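A minimal sketch of the mtime-based reload behavior the request describes (hypothetical class and names for illustration, not the actual ShellBasedIdMapping code committed in this patch): re-read the mapping file only when its modification time changes, so active mounts keep working across edits.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical sketch: reload a mapping file when its mtime changes.
// Not the actual Hadoop ShellBasedIdMapping implementation.
class ReloadableMapping {
    private final File mapFile;
    private long lastLoadedMtime = -1;   // sentinel: never loaded
    private String cachedContents = "";

    ReloadableMapping(File mapFile) {
        this.mapFile = mapFile;
    }

    // Return the file contents, re-reading only when the
    // modification time differs from the last load.
    synchronized String get() throws IOException {
        long mtime = mapFile.lastModified();
        if (mtime != lastLoadedMtime) {
            cachedContents = new String(Files.readAllBytes(mapFile.toPath()));
            lastLoadedMtime = mtime;
        }
        return cachedContents;
    }
}
```

Note that File.lastModified() has coarse (often one-second) resolution on some filesystems, so a real implementation may also want a minimum re-check interval.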



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-01-07 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7411:
--
Attachment: hdfs-7411.006.patch

 Refactor and improve decommissioning logic into DecommissionManager
 ---

 Key: HDFS-7411
 URL: https://issues.apache.org/jira/browse/HDFS-7411
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.5.1
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, 
 hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, 
 hdfs-7411.006.patch


 Would be nice to split out decommission logic from DatanodeManager to 
 DecommissionManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-01-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268306#comment-14268306
 ] 

Andrew Wang commented on HDFS-7411:
---

Thanks for the reviews Colin and Ming, new patch up.

* Rebased on trunk
* Updated for new ChunkedArrayList with a working size function
* Updated the logging calls to use new log level helper methods, consolidated 
some duplicated logging
* Did the rename and spacing fixes that Colin recommended. Log level change was 
necessary since slf4j doesn't have fatal.
* Ming, agree with everything you pointed out. The blocks limit enforcement is 
intentionally inexact; I played with iterating based on both DN+block rather 
than just DN, but it seemed more complex for a small gain.

As to the limit configuration, I didn't touch it for now until we agree on a 
solution. The idea here was to be compatible with the old config option, yet 
also provide a way of migrating to the new one. Ming's proposal seems 
reasonable, but the override makes configuration more complex. I feel that 
computing the limit based on runtime information could also lead to surprises.

 Refactor and improve decommissioning logic into DecommissionManager
 ---

 Key: HDFS-7411
 URL: https://issues.apache.org/jira/browse/HDFS-7411
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.5.1
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, 
 hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, 
 hdfs-7411.006.patch


 Would be nice to split out decommission logic from DatanodeManager to 
 DecommissionManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268329#comment-14268329
 ] 

Chris Nauroth commented on HDFS-7579:
-

Hi [~clamb].  This is a good idea.  Thanks for the patch.

One difference I see in this patch is that the logged value of 
{{numReportsSent}} is set to the length of the per-volume report list on both 
the single message and split RPC per report code paths.  Previously, this 
would have been 1 for the single message case.  I think the intent of this 
counter was to give an indication of the actual number of RPCs, so we'd want to 
keep 1 for the single message case.  Cc'ing [~arpitagarwal] for a second 
opinion.

 Improve log reporting during block report rpc failure
 -

 Key: HDFS-7579
 URL: https://issues.apache.org/jira/browse/HDFS-7579
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
  Labels: supportability
 Attachments: HDFS-7579.000.patch


 During block reporting, if the block report RPC fails, for example because it 
 exceeded the max rpc len, we should still produce some sort of LOG.info 
 output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7548) Corrupt block reporting delayed until datablock scanner thread detects it

2015-01-07 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268347#comment-14268347
 ] 

Nathan Roberts commented on HDFS-7548:
--

I think we need to handle the java.io.IOException: Input/output error case as 
well, since this is what we'll see when having trouble reading from disk.

 Corrupt block reporting delayed until datablock scanner thread detects it
 -

 Key: HDFS-7548
 URL: https://issues.apache.org/jira/browse/HDFS-7548
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: HDFS-7548.patch


 When there is only one datanode holding the block and that block happens to be 
 corrupt, the namenode keeps trying to replicate the block repeatedly, but it 
 only reports the block as corrupt when the data block scanner 
 thread of the datanode picks up this bad block.
 Requesting an improvement in namenode reporting so that the corrupt replica is 
 reported when there is only 1 replica and the replication of that replica 
 keeps failing with a checksum error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7579) Improve log reporting during block report rpc failure

2015-01-07 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268364#comment-14268364
 ] 

Arpit Agarwal commented on HDFS-7579:
-

Thank you for the heads up Chris.

I agree that numReportsSent should be the number of RPCs and not 
reports.length. We could rename it to be clearer.

Couple of additional comments:
# _final int nCmds = cmds.size();_ should check for {{cmds}} being null.
# The logged message should give some indication whether or not all RPCs 
succeeded. Perhaps we can log something like _successfully completed x of y RPC 
calls_. For the usual case (no split), y would be 1. For the split case it 
would be {{reports.length}}.
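The counting Arpit describes could be sketched roughly as follows (illustrative names only, not the actual BPServiceActor code): the RPC total is 1 for a single combined report, or one RPC per per-volume report when split.

```java
// Hypothetical sketch of the "x of y RPC calls" summary suggested in
// the comment above; reportsLen/split are illustrative parameters.
class BlockReportSummary {
    // Total number of block report RPCs: one combined RPC unless the
    // report is split into one RPC per storage report.
    static int totalRpcs(int reportsLen, boolean split) {
        return split ? reportsLen : 1;
    }

    static String line(int successfulRpcs, int reportsLen, boolean split) {
        return "Successfully sent " + successfulRpcs + " of "
            + totalRpcs(reportsLen, split) + " block report RPC(s)";
    }
}
```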

 Improve log reporting during block report rpc failure
 -

 Key: HDFS-7579
 URL: https://issues.apache.org/jira/browse/HDFS-7579
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
  Labels: supportability
 Attachments: HDFS-7579.000.patch


 During block reporting, if the block report RPC fails, for example because it 
 exceeded the max rpc len, we should still produce some sort of LOG.info 
 output to help with debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck

2015-01-07 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268661#comment-14268661
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7467:
---

I think the fsck output can have two sections, one for the blocks that 
satisfy the specified policy and one for the others.  E.g.

{noformat}
Blocks satisfying the specified policy:
Storage Policy       # of blocks   % of blocks
hot(DISK:3)               340730      97.7393%
hot(DISK:4)                 3928       1.1268%
frozen(ARCHIVE:3)           3122       0.8956%

Blocks NOT satisfying the specified policy:
Storage Policy       # of blocks   % of blocks
DISK:3                        44       0.0126%
DISK:1,ARCHIVE:2              30       0.0086%
ARCHIVE:3                      9       0.0026%
{noformat}



 Provide storage tier information for a directory via fsck
 -

 Key: HDFS-7467
 URL: https://issues.apache.org/jira/browse/HDFS-7467
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: balancer & mover
Affects Versions: 2.6.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7467.patch


 Currently _fsck_  provides information regarding blocks for a directory.
 It should be augmented to provide storage tier information (optionally). 
 The sample report could be as follows :
 {code}
 Storage Tier Combination    # of blocks   % of blocks
 DISK:1,ARCHIVE:2                 340730      97.7393%
 ARCHIVE:3                          3928       1.1268%
 DISK:2,ARCHIVE:2                   3122       0.8956%
 DISK:2,ARCHIVE:1                    748       0.2146%
 DISK:1,ARCHIVE:3                     44       0.0126%
 DISK:3,ARCHIVE:2                     30       0.0086%
 DISK:3,ARCHIVE:1                      9       0.0026%
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS

2015-01-07 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268685#comment-14268685
 ] 

Dave Marion commented on HDFS-1213:
---

FWIW, the current HDFS provider in Commons VFS is read-only.

 Implement an Apache Commons VFS Driver for HDFS
 ---

 Key: HDFS-1213
 URL: https://issues.apache.org/jira/browse/HDFS-1213
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Reporter: Michael D'Amour
 Attachments: HADOOP-HDFS-Apache-VFS.patch, 
 pentaho-hdfs-vfs-TRUNK-SNAPSHOT-sources.tar.gz, 
 pentaho-hdfs-vfs-TRUNK-SNAPSHOT.jar


 We have an open source ETL tool (Kettle) which uses VFS for many input/output 
 steps/jobs.  We would like to be able to read/write HDFS from Kettle using 
 VFS.  
  
 I haven't been able to find anything out there other than it would be nice.
  
 I had some time a few weeks ago to begin writing a VFS driver for HDFS and we 
 (Pentaho) would like to be able to contribute this driver.  I believe it 
 supports all the major file/folder operations and I have written unit tests 
 for all of these operations.  The code is currently checked into an open 
 Pentaho SVN repository under the Apache 2.0 license.  There are some current 
 limitations, such as a lack of authentication (Kerberos), which appears to be 
 coming in 0.22.0; the driver supports username/password, but I just 
 can't use them yet.
 I will be attaching the code for the driver once the case is created.  The 
 project does not modify existing hadoop/hdfs source.
 Our JIRA case can be found at http://jira.pentaho.com/browse/PDI-4146



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7056) Snapshot support for truncate

2015-01-07 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268692#comment-14268692
 ] 

Konstantin Shvachko commented on HDFS-7056:
---

The latest build was actually successful.
- Findbugs warning is fixed by HDFS-7583
- TestDecommission.testIncludeByRegistrationName() failure is reported in 
HDFS-7083

Are there any unresolved issues remaining? [~jingzhao], [~cmccabe], everybody. 
If not, could you please cast your votes.
The patches are still relevant for trunk.

 Snapshot support for truncate
 -

 Key: HDFS-7056
 URL: https://issues.apache.org/jira/browse/HDFS-7056
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HDFS-3107-HDFS-7056-combined-13.patch, 
 HDFS-3107-HDFS-7056-combined-15.patch, HDFS-3107-HDFS-7056-combined.patch, 
 HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
 HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
 HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
 HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
 HDFS-7056-13.patch, HDFS-7056-15.patch, HDFS-7056.patch, HDFS-7056.patch, 
 HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
 HDFS-7056.patch, HDFS-7056.patch, HDFSSnapshotWithTruncateDesign.docx


 Implementation of truncate in HDFS-3107 does not allow truncating files which 
 are in a snapshot. It is desirable to be able to truncate and still keep the 
 old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7589:
---
Attachment: HDFS-7589.002.patch

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.002.patch, HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
 Other applications which want to use libnative_mini_dfs have to link to 
 libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7589) Break the dependency between libnative_mini_dfs and libhdfs

2015-01-07 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268751#comment-14268751
 ] 

Zhanwei Wang commented on HDFS-7589:


Hi [~crisnack]

In the new patch, I moved the declaration of domainSocket to the top of the 
function and duplicated the definition of EINTERNAL.

Thanks for your review.

 Break the dependency between libnative_mini_dfs and libhdfs
 ---

 Key: HDFS-7589
 URL: https://issues.apache.org/jira/browse/HDFS-7589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Zhanwei Wang
 Attachments: HDFS-7589.002.patch, HDFS-7589.patch


 Currently libnative_mini_dfs links with libhdfs to reuse some common code. 
 Other applications which want to use libnative_mini_dfs have to link to 
 libhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267745#comment-14267745
 ] 

Hudson commented on HDFS-7564:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2017 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2017/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
 

 Key: HDFS-7564
 URL: https://issues.apache.org/jira/browse/HDFS-7564
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Assignee: Yongjun Zhang
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
 HDFS-7564.003.patch


 Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
 (default for static.id.mapping.file).
 It seems that the mappings file is currently only read upon restart of the 
 NFS gateway, which would cause any active clients' NFS mount points to hang or 
 fail.
 Regards,
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7453) Namenode does not recognize block is missing on a datanode

2015-01-07 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267992#comment-14267992
 ] 

Chris Nauroth commented on HDFS-7453:
-

Hi [~dmitrybugaychenko].  Thank you for the report.

I noticed that the replication factor is set to 2 in the example that you gave. 
 Is this the replication factor for all files that experienced this problem?

I haven't seen anything like this.  When we lose a replica from a DataNode 
going down, we expect the NameNode to schedule a re-replication to another 
healthy DataNode.  However, replication factor is most often set to 3.  I'm 
wondering if there might be a bug that only gets triggered with replication 
factor set to 2.

 Namenode does not recognize block is missing on a datanode
 --

 Key: HDFS-7453
 URL: https://issues.apache.org/jira/browse/HDFS-7453
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1
Reporter: Dmitry Bugaychenko
Priority: Critical

 We experienced severe data loss due to the fact that the namenode didn't 
 recognize that blocks it thinks exist do not actually exist on the datanodes. 
 The scenario is roughly as follows:
 # A disk fails on one of the datanodes
 # The datanode is forced to decommission and shut down
 # The disk is replaced and the datanode is back again
 # fsck shows everything is fine
 # Repeat 1-4 for a few weeks
 # Restart the namenode
 # It suddenly sees tens of thousands of under-replicated blocks and hundreds 
 of missing blocks
 During the next disk failure we analysed the situation a bit more and found a 
 particular block on a particular datanode that is missing: there is no file for 
 the block, and if we try to read it, we get
 {code}
 java.io.IOException: Got error for OP_READ_BLOCK, self=/XXX:33817, 
 remote=XXX/X.X.X.X:50010, for file 
 XXX/X.X.X.X:50010:BP-879324367-YYY-1404837025894:1083356878, for pool 
 BP-879324367-YYY-1404837025894 block 1083356878_9644290
 {code}
 We restarted the datanode and in the log we can see that it did scan all the 
 directories and send the report to namenode:
 {code}
 2014-11-27 17:06:34,174  INFO [DataNode: 
 [[[DISK]file:/mnt/hadoop/0/dfs/data/, [DISK]file:/mnt/hadoop/1/dfs/data/, 
 [DISK]file:/mnt/hadoop/2/dfs/data/]]  heartbeating to /YYY:8020] 
 FsDatasetImpl - Adding block pool BP-879324367-YYY-1404837025894
 2014-11-27 17:06:34,175  INFO [Thread-41] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/0/dfs/data/current...
 2014-11-27 17:06:34,176  INFO [Thread-43] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/2/dfs/data/current...
 2014-11-27 17:06:34,176  INFO [Thread-42] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/1/dfs/data/current...
 2014-11-27 17:06:34,279  INFO [Thread-42] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/1/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 62677866794
 2014-11-27 17:06:34,282  INFO [Thread-42] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/1/dfs/data/current: 
 105ms
 2014-11-27 17:06:34,744  INFO [Thread-41] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/0/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 2465590681232
 2014-11-27 17:06:34,744  INFO [Thread-41] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/0/dfs/data/current: 
 568ms
 2014-11-27 17:06:34,856  INFO [Thread-43] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/2/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 2475580099468
 2014-11-27 17:06:34,857  INFO [Thread-43] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/2/dfs/data/current: 
 680ms
 2014-11-27 17:06:34,857  INFO [DataNode: 
 [[[DISK]file:/mnt/hadoop/0/dfs/data/, [DISK]file:/mnt/hadoop/1/dfs/data/, 
 [DISK]file:/mnt/hadoop/2/dfs/data/]]  heartbeating to /YYY:8020] 
 2014-11-27 17:06:34,858  INFO [Thread-44] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/0/dfs/data/current...
 2014-11-27 17:06:34,890  INFO [Thread-46] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/2/dfs/data/current...
 2014-11-27 17:06:34,890  INFO [Thread-45] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/1/dfs/data/current...
 2014-11-27 17:06:34,961  INFO [Thread-45] FsDatasetImpl - Time to add 
 replicas to map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/1/dfs/data/current: 70ms
 2014-11-27 17:06:36,083  INFO [Thread-44] FsDatasetImpl - Time to add 
 replicas to 

[jira] [Commented] (HDFS-7453) Namenode does not recognize block is missing on a datanode

2015-01-07 Thread Jasper Hafkenscheid (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267999#comment-14267999
 ] 

Jasper Hafkenscheid commented on HDFS-7453:
---

Hi,

We have a cluster with bad disks (controllers), and we see this problem often.
Our replication factor is 3.
Luckily most disks recover after a power cycle.

Regards Jasper



 Namenode does not recognize block is missing on a datanode
 --

 Key: HDFS-7453
 URL: https://issues.apache.org/jira/browse/HDFS-7453
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1
Reporter: Dmitry Bugaychenko
Priority: Critical

 We experienced severe data loss due to the fact that the namenode didn't 
 recognize that blocks it thinks exist do not actually exist on the datanodes. 
 The scenario is roughly as follows:
 # A disk fails on one of the datanodes
 # The datanode is forced to decommission and shut down
 # The disk is replaced and the datanode is back again
 # fsck shows everything is fine
 # Repeat 1-4 for a few weeks
 # Restart the namenode
 # It suddenly sees tens of thousands of under-replicated blocks and hundreds 
 of missing blocks
 During the next disk failure we analysed the situation a bit more and found a 
 particular block on a particular datanode that is missing: there is no file for 
 the block, and if we try to read it, we get
 {code}
 java.io.IOException: Got error for OP_READ_BLOCK, self=/XXX:33817, 
 remote=XXX/X.X.X.X:50010, for file 
 XXX/X.X.X.X:50010:BP-879324367-YYY-1404837025894:1083356878, for pool 
 BP-879324367-YYY-1404837025894 block 1083356878_9644290
 {code}
 We restarted the datanode and in the log we can see that it did scan all the 
 directories and send the report to namenode:
 {code}
 2014-11-27 17:06:34,174  INFO [DataNode: 
 [[[DISK]file:/mnt/hadoop/0/dfs/data/, [DISK]file:/mnt/hadoop/1/dfs/data/, 
 [DISK]file:/mnt/hadoop/2/dfs/data/]]  heartbeating to /YYY:8020] 
 FsDatasetImpl - Adding block pool BP-879324367-YYY-1404837025894
 2014-11-27 17:06:34,175  INFO [Thread-41] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/0/dfs/data/current...
 2014-11-27 17:06:34,176  INFO [Thread-43] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/2/dfs/data/current...
 2014-11-27 17:06:34,176  INFO [Thread-42] FsDatasetImpl - Scanning block pool 
 BP-879324367-YYY-1404837025894 on volume /mnt/hadoop/1/dfs/data/current...
 2014-11-27 17:06:34,279  INFO [Thread-42] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/1/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 62677866794
 2014-11-27 17:06:34,282  INFO [Thread-42] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/1/dfs/data/current: 
 105ms
 2014-11-27 17:06:34,744  INFO [Thread-41] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/0/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 2465590681232
 2014-11-27 17:06:34,744  INFO [Thread-41] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/0/dfs/data/current: 
 568ms
 2014-11-27 17:06:34,856  INFO [Thread-43] FsDatasetImpl - Cached dfsUsed 
 found for 
 /mnt/hadoop/2/dfs/data/current/BP-879324367-YYY-1404837025894/current: 
 2475580099468
 2014-11-27 17:06:34,857  INFO [Thread-43] FsDatasetImpl - Time taken to scan 
 block pool BP-879324367-YYY-1404837025894 on /mnt/hadoop/2/dfs/data/current: 
 680ms
 2014-11-27 17:06:34,857  INFO [DataNode: 
 [[[DISK]file:/mnt/hadoop/0/dfs/data/, [DISK]file:/mnt/hadoop/1/dfs/data/, 
 [DISK]file:/mnt/hadoop/2/dfs/data/]]  heartbeating to /YYY:8020] 
 2014-11-27 17:06:34,858  INFO [Thread-44] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/0/dfs/data/current...
 2014-11-27 17:06:34,890  INFO [Thread-46] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/2/dfs/data/current...
 2014-11-27 17:06:34,890  INFO [Thread-45] FsDatasetImpl - Adding replicas to 
 map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/1/dfs/data/current...
 2014-11-27 17:06:34,961  INFO [Thread-45] FsDatasetImpl - Time to add 
 replicas to map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/1/dfs/data/current: 70ms
 2014-11-27 17:06:36,083  INFO [Thread-44] FsDatasetImpl - Time to add 
 replicas to map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/0/dfs/data/current: 1193ms
 2014-11-27 17:06:36,162  INFO [Thread-46] FsDatasetImpl - Time to add 
 replicas to map for block pool BP-879324367-YYY-1404837025894 on volume 
 /mnt/hadoop/2/dfs/data/current: 1271ms
 2014-11-27 17:06:36,162  INFO [DataNode: 
 

[jira] [Commented] (HDFS-7564) NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map

2015-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267681#comment-14267681
 ] 

Hudson commented on HDFS-7564:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #63 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/63/])
HDFS-7564. NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map. 
Contributed by Yongjun Zhang (brandonli: rev 
788ee35e2bf0f3d445e03e6ea9bd02c40c8fdfe3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestShellBasedIdMapping.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedIdMapping.java


 NFS gateway dynamically reload UID/GID mapping file /etc/nfs.map
 

 Key: HDFS-7564
 URL: https://issues.apache.org/jira/browse/HDFS-7564
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: nfs
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Assignee: Yongjun Zhang
Priority: Minor
 Fix For: 2.7.0

 Attachments: HDFS-7564.001.patch, HDFS-7564.002.patch, 
 HDFS-7564.003.patch


 Add dynamic reload of the NFS gateway UID/GID mappings file /etc/nfs.map 
 (default for static.id.mapping.file).
 It seems that the mappings file is currently only read upon restart of the 
 NFS gateway, which would cause any active clients' NFS mount points to hang or 
 fail.
 Regards,
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1195) Offer rate limits for replicating data

2015-01-07 Thread Cosmin Lehene (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267740#comment-14267740
 ] 

Cosmin Lehene commented on HDFS-1195:
-

[~kevinweil] is this still valid?

 Offer rate limits for replicating data 
 ---

 Key: HDFS-1195
 URL: https://issues.apache.org/jira/browse/HDFS-1195
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 0.20.2
 Environment: Linux, Hadoop 0.20.1 CDH
Reporter: Kevin Weil

 If a rack of Hadoop nodes goes down, there is a lot of data to re-replicate.  
 It would be great to have a configuration option to rate-limit the amount of 
 bandwidth used for re-replication so as not to saturate network backlinks.  
 There is a similar option for rate limiting the speed at which a DFS 
 rebalance takes place: dfs.balance.bandwidthPerSec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode

2015-01-07 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267925#comment-14267925
 ] 

Dave Marion commented on HDFS-1312:
---

Note that as of 2.6.0 the layout has changed on the DataNode. Please see [1] 
for more information. I don't know if the tools mentioned above will work with 
these new restrictions.

[1] 
https://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F


 Re-balance disks within a Datanode
 --

 Key: HDFS-1312
 URL: https://issues.apache.org/jira/browse/HDFS-1312
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Reporter: Travis Crawford

 Filing this issue in response to ``full disk woes`` on hdfs-user.
 Datanodes fill their storage directories unevenly, leading to situations 
 where certain disks are full while others are significantly less used. Users 
 at many different sites have experienced this issue, and HDFS administrators 
 are taking steps like:
 - Manually rebalancing blocks in storage directories
 - Decommissioning nodes & later re-adding them
 There's a tradeoff between making use of all available spindles, and filling 
 disks at the sameish rate. Possible solutions include:
 - Weighting less-used disks heavier when placing new blocks on the datanode. 
 In write-heavy environments this will still make use of all spindles, 
 equalizing disk use over time.
 - Rebalancing blocks locally. This would help equalize disk use as disks are 
 added/replaced in older cluster nodes.
 Datanodes should actively manage their local disk so operator intervention is 
 not needed.
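The first option above, weighting less-used disks heavier, could be sketched like this (hypothetical class; later Hadoop versions expose a pluggable volume choosing policy for this purpose): pick each new block's volume with probability proportional to its free space, so emptier disks fill faster without idling any spindle.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch of free-space-weighted volume selection.
// Not actual Hadoop code; illustrates the idea only.
class FreeSpaceWeightedChooser {
    private final Random rand;

    FreeSpaceWeightedChooser(Random rand) { this.rand = rand; }

    // Return the index of the chosen volume, weighted by free bytes.
    int choose(List<Long> freeBytesPerVolume) {
        long total = 0;
        for (long free : freeBytesPerVolume) total += free;
        if (total == 0) return 0;           // all volumes full: fall back
        long pick = (long) (rand.nextDouble() * total);
        long acc = 0;
        for (int i = 0; i < freeBytesPerVolume.size(); i++) {
            acc += freeBytesPerVolume.get(i);
            if (pick < acc) return i;       // lands inside volume i's share
        }
        return freeBytesPerVolume.size() - 1;
    }
}
```

A full volume (zero free bytes) gets zero weight, so write-heavy workloads naturally equalize disk use over time.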



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)