[jira] [Commented] (HDFS-5356) MiniDFSCluster should close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365283#comment-14365283 ] Hudson commented on HDFS-5356: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/]) HDFS-5356. MiniDFSCluster should close all open FileSystems when shutdown() (Contributed by Rakesh R) (vinayakumarb: rev 018893e81ec1c43e6c79c77adec92c2edfb20cab) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java MiniDFSCluster should close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some metrics-related unit tests fail: MiniDFSCluster never closes the FileSystems it opens, so DFSClients stay alive after MiniDFSCluster shutdown(). Their metrics remain in DefaultMetricsSystem and make other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
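The shape of the fix described above — the cluster remembering every client it hands out and closing them all on shutdown() — can be sketched in plain Java. This is a hypothetical illustration of the pattern, not the actual MiniDFSCluster patch; the class and method names are invented:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the HDFS-5356 pattern: the cluster tracks every
// client (FileSystem) it hands out and closes them all in shutdown(), so
// no client -- and none of its registered metrics -- outlives the cluster.
class MiniClusterSketch {
    private final List<AutoCloseable> openClients = new ArrayList<>();
    private boolean running = true;

    public <T extends AutoCloseable> T register(T client) {
        openClients.add(client);   // remembered for shutdown()
        return client;
    }

    public void shutdown() {
        for (AutoCloseable c : openClients) {
            try {
                c.close();         // best effort: keep closing the rest on failure
            } catch (Exception ignored) {
            }
        }
        openClients.clear();
        running = false;
    }

    public boolean isRunning() { return running; }
    public int openClientCount() { return openClients.size(); }
}
```

Closing best-effort (swallowing per-client exceptions) mirrors what a test harness needs: one misbehaving client must not prevent the rest from being released.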
[jira] [Commented] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365284#comment-14365284 ] Hudson commented on HDFS-7838: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/]) HDFS-7838. Expose truncate API for libhdfs. (yliu) (yliu: rev 48c2db34eff376c0f3a72587a5540b1e3dffafd2) * hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/hdfs_web.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c Expose truncate API for libhdfs --- Key: HDFS-7838 URL: https://issues.apache.org/jira/browse/HDFS-7838 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch, HDFS-7838.003.patch, HDFS-7838.004.patch It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart sometimes times out
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365276#comment-14365276 ] Hudson commented on HDFS-7886: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/]) HDFS-7886. Fix TestFileTruncate failures. Contributed by Plamen Jeliazkov and Konstantin Shvachko. (shv: rev ce5de93a5837e115e1f0b7d3c5a67ace25385a63) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestFileTruncate#testTruncateWithDataNodesRestart sometimes times out Key: HDFS-7886 URL: https://issues.apache.org/jira/browse/HDFS-7886 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Plamen Jeliazkov Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7886-01.patch, HDFS-7886-02.patch, HDFS-7886-branch2.patch, HDFS-7886.patch https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7941) SequenceFile.Writer.hsync() not working?
Sverre Bakke created HDFS-7941: -- Summary: SequenceFile.Writer.hsync() not working? Key: HDFS-7941 URL: https://issues.apache.org/jira/browse/HDFS-7941 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Environment: HDP 2.2 running on Redhat Reporter: Sverre Bakke Priority: Minor When using SequenceFile.Writer and repeatedly appending and syncing to a file, the sync does not appear to work except once after writing the headers and once when closing. Imagine the following test case: http://pastebin.com/Y9xysCRX This code appends a new record every second and then immediately syncs it. One would expect the file to grow with every append; however, this does not happen. Watching the behavior, I noticed that it only syncs the headers at the very beginning (producing a file of 164 bytes) and then never again until it is closed, despite hsync() being called after every append. The debug logs show the same behavior (I executed the provided code example and grepped for sync): SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder. SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 2015-03-17 15:55:14 DEBUG ProtobufRpcEngine:253 - Call: fsync took 11ms This was the only time the code ran fsync throughout the entire execution. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
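The behavior the reporter expects — the file visibly growing after every append+sync — is what a durable sync gives you on a local filesystem. The following pure-JDK sketch is only an analogy for hsync() using java.nio's FileChannel.force(), not SequenceFile or HDFS code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Analogy only (not SequenceFile code): hsync() is intended to behave like
// FileChannel.force() on a local file -- after each append+sync the bytes
// are flushed to the device and the observed file length grows.
class HsyncAnalogy {
    static long appendAndSync(FileChannel ch, byte[] record) throws IOException {
        ch.write(ByteBuffer.wrap(record));
        ch.force(true);          // flush data and metadata, like hsync()
        return ch.size();        // length visible after the sync
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("hsync-analogy", ".dat");
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.WRITE)) {
            long before = ch.size();
            long after = appendAndSync(ch, "record\n".getBytes());
            if (after <= before) throw new AssertionError("file did not grow");
        }
        Files.delete(p);
    }
}
```

The bug report is that SequenceFile.Writer's buffering breaks this expectation: the hsync() call reaches the RPC layer only once, instead of after each append.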
[jira] [Updated] (HDFS-7941) SequenceFile.Writer.hsync() not working?
[ https://issues.apache.org/jira/browse/HDFS-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sverre Bakke updated HDFS-7941: --- Priority: Major (was: Minor) SequenceFile.Writer.hsync() not working? Key: HDFS-7941 URL: https://issues.apache.org/jira/browse/HDFS-7941 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Environment: HDP 2.2 running on Redhat Reporter: Sverre Bakke When using SequenceFile.Writer and repeatedly appending and syncing to a file, the sync does not appear to work except once after writing the headers and once when closing. Imagine the following test case: http://pastebin.com/Y9xysCRX This code appends a new record every second and then immediately syncs it. One would expect the file to grow with every append; however, this does not happen. Watching the behavior, I noticed that it only syncs the headers at the very beginning (producing a file of 164 bytes) and then never again until it is closed, despite hsync() being called after every append. The debug logs show the same behavior (I executed the provided code example and grepped for sync): SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder. SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 2015-03-17 15:55:14 DEBUG ProtobufRpcEngine:253 - Call: fsync took 11ms This was the only time the code ran fsync throughout the entire execution. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart sometimes times out
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365327#comment-14365327 ] Hudson commented on HDFS-7886: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/]) HDFS-7886. Fix TestFileTruncate failures. Contributed by Plamen Jeliazkov and Konstantin Shvachko. (shv: rev ce5de93a5837e115e1f0b7d3c5a67ace25385a63) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java TestFileTruncate#testTruncateWithDataNodesRestart sometimes times out Key: HDFS-7886 URL: https://issues.apache.org/jira/browse/HDFS-7886 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Plamen Jeliazkov Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7886-01.patch, HDFS-7886-02.patch, HDFS-7886-branch2.patch, HDFS-7886.patch https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-2360) Ugly stacktrace when quota exceeds
[ https://issues.apache.org/jira/browse/HDFS-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365294#comment-14365294 ] Hudson commented on HDFS-2360: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/]) HDFS-2360. Ugly stacktrace when quota exceeds. (harsh) (harsh: rev 046521cd6511b7fc6d9478cb2bed90d8e75fca20) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Ugly stacktrace when quota exceeds -- Key: HDFS-2360 URL: https://issues.apache.org/jira/browse/HDFS-2360 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 0.23.0 Reporter: Rajit Saha Assignee: Harsh J Priority: Minor Fix For: 2.8.0 Attachments: HDFS-2360.patch, HDFS-2360.patch Would it be better to catch the exception and show the user a short, reasonable message when they exceed the quota? $hdfs dfs -mkdir testDir $hdfs dfsadmin -setSpaceQuota 191M testDir $hdfs dfs -count -q testDir none inf 200278016 2002780161 0 0 hdfs://NN hostname:port/user/hdfsqa/testDir $hdfs dfs -put /etc/passwd /user/hadoopqa/testDir 11/09/19 08:08:15 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:365) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1100) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:972) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) Caused by: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at
[jira] [Commented] (HDFS-6841) Use Time.monotonicNow() wherever applicable instead of Time.now()
[ https://issues.apache.org/jira/browse/HDFS-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365300#comment-14365300 ] Kihwal Lee commented on HDFS-6841: -- We have seen multiple occasions of a dead datanode coming back and trying to resend an IBR every second. Normally the offerService loop makes it heartbeat and re-register, but these nodes were skipping the heartbeat and kept re-sending the IBR. I got hold of one such node and took a heap dump to see what was going on. {noformat} lastBlockReport (J) : 1426164231598 lastCacheReport (J) : 0 lastDeletedReport (J) : 1426206179858 lastHeartbeat (J) : 2693080084078 {noformat} Since {{lastHeartbeat}} is huge, it wasn't sending heartbeats. We should move forward with this. Use Time.monotonicNow() wherever applicable instead of Time.now() - Key: HDFS-6841 URL: https://issues.apache.org/jira/browse/HDFS-6841 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6841-001.patch, HDFS-6841-002.patch, HDFS-6841-003.patch, HDFS-6841-004.patch {{Time.now()}} is used in many places to calculate elapsed time. It should be replaced with {{Time.monotonicNow()}} so that elapsed-time calculations are unaffected by system time changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
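The distinction driving HDFS-6841 can be shown with the plain JDK clocks that back Hadoop's Time class: System.currentTimeMillis() (behind Time.now()) jumps when the system clock is reset, while System.nanoTime() (behind Time.monotonicNow()) only moves forward. A minimal sketch of monotonic elapsed-time measurement:

```java
// Elapsed time measured against a monotonic clock (System.nanoTime(),
// which backs Time.monotonicNow()) can never go negative or jump, even if
// the wall clock (System.currentTimeMillis(), which backs Time.now()) is
// set backwards mid-measurement -- the failure mode behind the huge
// lastHeartbeat value reported above.
class ElapsedTime {
    static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

Usage: capture `long start = System.nanoTime();` before the operation and call `elapsedMillis(start)` after; never subtract two wall-clock readings for durations.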
[jira] [Comment Edited] (HDFS-6841) Use Time.monotonicNow() wherever applicable instead of Time.now()
[ https://issues.apache.org/jira/browse/HDFS-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365300#comment-14365300 ] Kihwal Lee edited comment on HDFS-6841 at 3/17/15 3:13 PM: --- We have seen multiple occasions of a dead datanode coming back and trying to resend an IBR every second. Normally the offerService loop makes it heartbeat and re-register, but these nodes were skipping the heartbeat and kept re-sending the IBR. I got hold of one such node and took a heap dump to see what was going on. {noformat} lastBlockReport (J) : 1426164231598 lastCacheReport (J) : 0 lastDeletedReport (J) : 1426206179858 lastHeartbeat (J) : 2693080084078 {noformat} Since {{lastHeartbeat}} was huge, it wasn't sending heartbeats. We should move forward with this. was (Author: kihwal): We have seen multiple occasions of a dead datanode coming back and trying to resend an IBR every second. Normally the offerService loop makes it heartbeat and re-register, but these nodes were skipping the heartbeat and kept re-sending the IBR. I got hold of one such node and took a heap dump to see what was going on. {noformat} lastBlockReport (J) : 1426164231598 lastCacheReport (J) : 0 lastDeletedReport (J) : 1426206179858 lastHeartbeat (J) : 2693080084078 {noformat} Since {{lastHeartbeat}} is huge, it wasn't sending heartbeats. We should move forward with this. Use Time.monotonicNow() wherever applicable instead of Time.now() - Key: HDFS-6841 URL: https://issues.apache.org/jira/browse/HDFS-6841 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6841-001.patch, HDFS-6841-002.patch, HDFS-6841-003.patch, HDFS-6841-004.patch {{Time.now()}} is used in many places to calculate elapsed time. It should be replaced with {{Time.monotonicNow()}} so that elapsed-time calculations are unaffected by system time changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6841) Use Time.monotonicNow() wherever applicable instead of Time.now()
[ https://issues.apache.org/jira/browse/HDFS-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365308#comment-14365308 ] Hadoop QA commented on HDFS-6841: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669910/HDFS-6841-004.patch against trunk revision 7179f94. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9922//console This message is automatically generated. Use Time.monotonicNow() wherever applicable instead of Time.now() - Key: HDFS-6841 URL: https://issues.apache.org/jira/browse/HDFS-6841 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-6841-001.patch, HDFS-6841-002.patch, HDFS-6841-003.patch, HDFS-6841-004.patch {{Time.now()}} is used in many places to calculate elapsed time. It should be replaced with {{Time.monotonicNow()}} so that elapsed-time calculations are unaffected by system time changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-2360) Ugly stacktrace when quota exceeds
[ https://issues.apache.org/jira/browse/HDFS-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365346#comment-14365346 ] Hudson commented on HDFS-2360: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/]) HDFS-2360. Ugly stacktrace when quota exceeds. (harsh) (harsh: rev 046521cd6511b7fc6d9478cb2bed90d8e75fca20) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java Ugly stacktrace when quota exceeds -- Key: HDFS-2360 URL: https://issues.apache.org/jira/browse/HDFS-2360 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 0.23.0 Reporter: Rajit Saha Assignee: Harsh J Priority: Minor Fix For: 2.8.0 Attachments: HDFS-2360.patch, HDFS-2360.patch Would it be better to catch the exception and show the user a short, reasonable message when they exceed the quota? $hdfs dfs -mkdir testDir $hdfs dfsadmin -setSpaceQuota 191M testDir $hdfs dfs -count -q testDir none inf 200278016 2002780161 0 0 hdfs://NN hostname:port/user/hdfsqa/testDir $hdfs dfs -put /etc/passwd /user/hadoopqa/testDir 11/09/19 08:08:15 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:365) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1100) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:972) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) Caused by: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at
[jira] [Commented] (HDFS-5356) MiniDFSCluster should close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365335#comment-14365335 ] Hudson commented on HDFS-5356: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/]) HDFS-5356. MiniDFSCluster should close all open FileSystems when shutdown() (Contributed by Rakesh R) (vinayakumarb: rev 018893e81ec1c43e6c79c77adec92c2edfb20cab) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt MiniDFSCluster should close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some metrics-related unit tests fail: MiniDFSCluster never closes the FileSystems it opens, so DFSClients stay alive after MiniDFSCluster shutdown(). Their metrics remain in DefaultMetricsSystem and make other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365336#comment-14365336 ] Hudson commented on HDFS-7838: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/]) HDFS-7838. Expose truncate API for libhdfs. (yliu) (yliu: rev 48c2db34eff376c0f3a72587a5540b1e3dffafd2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/hdfs_web.c * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h Expose truncate API for libhdfs --- Key: HDFS-7838 URL: https://issues.apache.org/jira/browse/HDFS-7838 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch, HDFS-7838.003.patch, HDFS-7838.004.patch It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7948: - Attachment: HDFS-7948.00.patch TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly. {code} testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR! java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414) at java.io.File.getCanonicalPath(File.java:589) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
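The root cause above is feeding a URI string to an API that expects a native path. One way to handle such a string in plain Java is to route it through java.net.URI before constructing the File — a hypothetical sketch of the kind of fix, not the attached patch:

```java
import java.io.File;
import java.net.URI;

// Hypothetical sketch: "file:/c:/users/.../new_vol1" is a URI, not a native
// path, so it must be decoded via java.net.URI before java.io.File operations
// like getCanonicalPath() can work on it (notably on Windows).
class VolumePath {
    static File toFile(String location) {
        if (location.startsWith("file:")) {
            return new File(URI.create(location));  // decode the URI form
        }
        return new File(location);                   // already a native path
    }
}
```

With this, `toFile("file:/tmp/data/new_vol1")` and `toFile("/tmp/data/new_vol1")` both yield a File whose canonical path can be computed safely.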
[jira] [Updated] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7948: - Status: Patch Available (was: Open) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly. {code} testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR! java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414) at java.io.File.getCanonicalPath(File.java:589) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
Hui Zheng created HDFS-7949: --- Summary: WebImageViewer needs to support file size calculation with striped blocks Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Priority: Minor WebImageViewer's file size calculation should be changed when the file's blocks are striped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7827) Erasure Coding: support striped blocks in non-protobuf fsimage
[ https://issues.apache.org/jira/browse/HDFS-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-7827: Attachment: (was: HDFS-7827.001.patch) Erasure Coding: support striped blocks in non-protobuf fsimage -- Key: HDFS-7827 URL: https://issues.apache.org/jira/browse/HDFS-7827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Hui Zheng Attachments: HDFS-7827.000.patch, HDFS-7827.002.patch HDFS-7749 only adds code to persist striped blocks to protobuf-based fsimage. We should also add this support to the non-protobuf fsimage since it is still used for use cases like offline image processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7827) Erasure Coding: support striped blocks in non-protobuf fsimage
[ https://issues.apache.org/jira/browse/HDFS-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-7827: Attachment: (was: HDFS-7285.003.patch) Erasure Coding: support striped blocks in non-protobuf fsimage -- Key: HDFS-7827 URL: https://issues.apache.org/jira/browse/HDFS-7827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Hui Zheng Attachments: HDFS-7827.000.patch, HDFS-7827.002.patch HDFS-7749 only adds code to persist striped blocks to protobuf-based fsimage. We should also add this support to the non-protobuf fsimage since it is still used for use cases like offline image processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7827) Erasure Coding: support striped blocks in non-protobuf fsimage
[ https://issues.apache.org/jira/browse/HDFS-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-7827: Attachment: HDFS-7827.003.patch Erasure Coding: support striped blocks in non-protobuf fsimage -- Key: HDFS-7827 URL: https://issues.apache.org/jira/browse/HDFS-7827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Hui Zheng Attachments: HDFS-7827.000.patch, HDFS-7827.002.patch, HDFS-7827.003.patch HDFS-7749 only adds code to persist striped blocks to protobuf-based fsimage. We should also add this support to the non-protobuf fsimage since it is still used for use cases like offline image processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366652#comment-14366652 ] Walter Su commented on HDFS-7891: - Third, a network topology of 1k nodes is small compared to the metadata of 100m files. The block placement policy costs little CPU time, so it will not become a bottleneck for the NameNode. This is another reason I think the {{maxNodesPerRack}} method will be fine. A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt A block placement policy that tries its best to spread replicas across as many racks as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
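The {{maxNodesPerRack}} idea referred to above can be illustrated with a small arithmetic sketch: capping each rack at the ceiling of replicas divided by racks forces replicas onto as many distinct racks as possible. This is a hypothetical illustration of the concept, not Hadoop's actual BlockPlacementPolicy formula:

```java
// Hypothetical illustration of a per-rack cap: limiting each rack to
// ceil(replicas / racks) nodes means replicas must spread across as many
// racks as possible, which maximizes fault tolerance to a rack failure.
class RackCap {
    static int maxNodesPerRack(int replicas, int racks) {
        return (replicas + racks - 1) / racks;   // integer ceil(replicas / racks)
    }
}
```

For example, 3 replicas over 2 racks gives a cap of 2 (so at least two racks are used), and 3 replicas over 3 or more racks gives a cap of 1 (one replica per rack).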
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE during tearDown() on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366581#comment-14366581 ] Hadoop QA commented on HDFS-7946: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705223/HDFS-7946.00.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9948//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9948//console This message is automatically generated. TestDataNodeVolumeFailureReporting NPE during tearDown() on Windows --- Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp() that calls assumeTrue(!Path.WINDOWS), but the post-test tearDown() does not. This triggers an NPE when closing the cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! 
org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
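The failure pattern above (setUp skipped by an assumption, tearDown then dereferencing an uninitialized field) can be sketched as follows. This is a hedged illustration with simplified stand-in types and names, not the actual HDFS-7946 patch: `TearDownGuardSketch` and its `AutoCloseable` field are placeholders for the test class and `MiniDFSCluster`.

```java
// Hedged sketch of the fix pattern (assumption: simplified stand-in
// types and names, not the actual HDFS-7946 patch). When a JUnit
// assumption fails inside setUp(), the test's fields are never
// initialized, so tearDown() must tolerate a null cluster instead of
// calling shutdown() on it unconditionally.
public class TearDownGuardSketch {
    private AutoCloseable cluster; // stands in for MiniDFSCluster

    public void setUp(boolean onWindows) {
        if (onWindows) {
            return; // models assumeTrue(!Path.WINDOWS) aborting setUp early
        }
        cluster = () -> { }; // placeholder for building a real cluster
    }

    /** @return true if there was a cluster to shut down. */
    public boolean tearDown() throws Exception {
        if (cluster == null) { // guard so skipped tests don't NPE
            return false;
        }
        cluster.close(); // stands in for cluster.shutdown()
        cluster = null;
        return true;
    }
}
```

The same effect can be achieved by repeating the assumption check at the top of tearDown(); the null-check variant shown here also covers setUp() failing partway through.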
[jira] [Commented] (HDFS-7881) TestHftpFileSystem#testSeek fails in branch-2
[ https://issues.apache.org/jira/browse/HDFS-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366604#comment-14366604 ] Brahma Reddy Battula commented on HDFS-7881: Test failures are unrelated this jira.. TestHftpFileSystem#testSeek fails in branch-2 - Key: HDFS-7881 URL: https://issues.apache.org/jira/browse/HDFS-7881 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Priority: Blocker Attachments: HDFS-7881.patch TestHftpFileSystem#testSeek fails in branch-2. {code} --- T E S T S --- Running org.apache.hadoop.hdfs.web.TestHftpFileSystem Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.201 sec FAILURE! - in org.apache.hadoop.hdfs.web.TestHftpFileSystem testSeek(org.apache.hadoop.hdfs.web.TestHftpFileSystem) Time elapsed: 0.054 sec ERROR! java.io.IOException: Content-Length is missing: {null=[HTTP/1.1 206 Partial Content], Date=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 GMT], Expires=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 GMT], Connection=[close], Content-Type=[text/plain; charset=utf-8], Server=[Jetty(6.1.26)], Content-Range=[bytes 7-9/10], Pragma=[no-cache, no-cache], Cache-Control=[no-cache]} at org.apache.hadoop.hdfs.web.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:132) at org.apache.hadoop.hdfs.web.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:104) at org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:181) at java.io.FilterInputStream.read(FilterInputStream.java:83) at org.apache.hadoop.hdfs.web.TestHftpFileSystem.testSeek(TestHftpFileSystem.java:253) Results : Tests in error: TestHftpFileSystem.testSeek:253 ยป IO Content-Length is missing: {null=[HTTP/1 Tests run: 14, Failures: 0, Errors: 1, Skipped: 0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366605#comment-14366605 ] Walter Su commented on HDFS-7891: - Thanks [~zhz], the approach is much simpler. Could you help me revert HADOOP-11676? I'll close that and rewrite the patch in HDFS-7891. This is what I thought: 1. If I decide to use the sorted-rack method, I have to optimize rack access in {{NetworkTopology.countNumOfAvailableNodes()}} and {{NetworkTopology.getNode()}}, and add HADOOP-11676. I'm not very comfortable with that. 2. I think the random policy with {{maxNodesPerRack}} works well for EC. If the cluster is small and the burden on the Namenode is small, the worst case doesn't look so bad. If the cluster is medium/large, the policy won't hit the worst case. You can check {{testresult.txt}} A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt A block placement policy that tries its best to spread replicas across as many racks as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
Xiaoyu Yao created HDFS-7948: Summary: TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao The failure below occurs because File#getCanonicalPath() does not work with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1) on Windows. I will post a fix shortly.
{code}
testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes)  Time elapsed: 5.746 sec  ERROR!
java.io.IOException: The filename, directory name, or volume label syntax is incorrect
	at java.io.WinNTFileSystem.canonicalize0(Native Method)
	at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414)
	at java.io.File.getCanonicalPath(File.java:589)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
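The failure mode can be demonstrated in isolation. This is a hedged sketch, not the actual HDFS-7948 patch: the helper name is illustrative, and only the path string comes from the report. A File built directly from the *string* "file:/c:/..." keeps the URI scheme inside its path, which is not a valid Windows path, so getCanonicalPath() throws there; parsing the string as a java.net.URI first and handing the URI to the File constructor yields a plain local path.

```java
import java.io.File;
import java.net.URI;

// Hedged sketch of the failure mode (assumption: illustrative helper,
// not the actual fix). new File("file:/c:/...") treats the whole string,
// scheme included, as a path; new File(new URI(...)) strips the scheme.
public class CanonicalPathSketch {
    public static File fromUriString(String uriString) throws Exception {
        return new File(new URI(uriString)); // scheme removed, local path kept
    }
}
```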
[jira] [Commented] (HDFS-7829) Code clean up for LocatedBlock
[ https://issues.apache.org/jira/browse/HDFS-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366647#comment-14366647 ] Hadoop QA commented on HDFS-7829: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705237/HDFS-7829.4.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9950//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9950//console This message is automatically generated. Code clean up for LocatedBlock -- Key: HDFS-7829 URL: https://issues.apache.org/jira/browse/HDFS-7829 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Assignee: Takanobu Asanuma Priority: Minor Attachments: HDFS-7829.1.patch, HDFS-7829.2.patch, HDFS-7829.3.patch, HDFS-7829.4.patch We can do some code cleanup for {{LocatedBlock}}, including: # Using a simple Builder pattern to avoid multiple constructors # Setting data fields like {{corrupt}} and {{offset}} to final -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7827) Erasure Coding: support striped blocks in non-protobuf fsimage
[ https://issues.apache.org/jira/browse/HDFS-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HDFS-7827: Attachment: HDFS-7285.003.patch Hi Jing, I updated the patch, please review it. 1. Corrected the {{writeINodeUnderConstruction}}/{{readINodeUnderConstruction}} methods. 2. Added a testcase for saving/loading an INodeFileUnderConstruction. 3. Corrected the coding format, but I'm not sure it is all correct. If there is still code that does not follow the conventions, please tell me the details. 4. About the WebImageViewer, I created a new jira: [HDFS-7949|https://issues.apache.org/jira/browse/HDFS-7949] Erasure Coding: support striped blocks in non-protobuf fsimage -- Key: HDFS-7827 URL: https://issues.apache.org/jira/browse/HDFS-7827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Hui Zheng Attachments: HDFS-7285.003.patch, HDFS-7827.000.patch, HDFS-7827.001.patch, HDFS-7827.002.patch HDFS-7749 only adds code to persist striped blocks to the protobuf-based fsimage. We should also add this support to the non-protobuf fsimage since it is still used for use cases like offline image processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7946: Summary: TestDataNodeVolumeFailureReporting NPE on Windows (was: TestDataNodeVolumeFailureReporting NPE during tearDown() on Windows) TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7946: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk through branch-2.7. Thanks for the contribution [~xyao]. TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366635#comment-14366635 ] Hudson commented on HDFS-7946: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7355 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7355/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7792) Add links to FaultInjectFramework and SLGUserGuide to site index
[ https://issues.apache.org/jira/browse/HDFS-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366655#comment-14366655 ] Masatake Iwasaki commented on HDFS-7792: Fault Injection Framework does not work in build with maven. Add links to FaultInjectFramework and SLGUserGuide to site index Key: HDFS-7792 URL: https://issues.apache.org/jira/browse/HDFS-7792 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor FaultInjectFramework.html SLGUserGuide.html are not linked from anywhere. Add link to them to site.xml if the contents are not outdated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7620) Change disk quota calculation for EC files
[ https://issues.apache.org/jira/browse/HDFS-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364645#comment-14364645 ] Takanobu Asanuma commented on HDFS-7620: I'd like to work on this ticket. Would you please assign it to me? Thank you. Change disk quota calculation for EC files -- Key: HDFS-7620 URL: https://issues.apache.org/jira/browse/HDFS-7620 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze EC files have different disk space usage than replication. We need to change the quota calculation to support them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7717) Erasure Coding: provide a tool for convert files between replication and erasure coding
[ https://issues.apache.org/jira/browse/HDFS-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364666#comment-14364666 ] Jing Zhao commented on HDFS-7717: - Sure, assigning the jira to you. Erasure Coding: provide a tool for convert files between replication and erasure coding --- Key: HDFS-7717 URL: https://issues.apache.org/jira/browse/HDFS-7717 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Kai Sasaki We need a tool to do offline conversion between replication and erasure coding. The tool itself can either utilize MR just like the current distcp, or act like the balancer/mover. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7938) OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac
[ https://issues.apache.org/jira/browse/HDFS-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364704#comment-14364704 ] Kiran Kumar M R commented on HDFS-7938: --- I am continuing the discussion from [~cmccabe]'s comment https://issues.apache.org/jira/browse/HADOOP-11638?focusedCommentId=14364215&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14364215 bq. I looked at this and found that {{pthread_threadid_np}} on Mac has the type signature: {{int pthread_threadid_np(pthread_t thread, __uint64_t *thread_id)}} This doesn't match with using an {{unsigned long}}. I'm not sure under what conditions an unsigned long is different than a {{uint64_t}} on Mac (on Linux, that would be the case with 32-bit compilation). So this patch may have a buffer overflow in that case. I agree there may be a buffer overflow in the case of 32-bit compilation on Mac. I went ahead with the patch since the 64-bit build is most commonly used. I will submit a patch soon that uses {{uint64_t}} and casts it to {{unsigned long}}. OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac Key: HDFS-7938 URL: https://issues.apache.org/jira/browse/HDFS-7938 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Colin Patrick McCabe Assignee: Kiran Kumar M R Priority: Critical In OpensslSecureRandom.c, pthread_threadid_np is being used with an unsigned long, but the type signature requires a uint64_t. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
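The shape of the fix being discussed can be sketched as below. This is a hedged illustration, not the actual OpensslSecureRandom.c patch: the function name is invented, and the non-Mac fallback exists only so the sketch compiles everywhere. The point is that pthread_threadid_np() writes 8 bytes through its pointer argument, so the destination must be a uint64_t even when the caller ultimately wants an unsigned long; on a 32-bit build, passing the address of a 4-byte unsigned long would overflow.

```c
#include <pthread.h>
#include <stdint.h>

/*
 * Hedged sketch (assumption: illustrative, not the real patch). Read the
 * thread id into a uint64_t of the width the callee expects, and narrow
 * with an explicit cast only after the call returns.
 */
static unsigned long get_thread_id(void)
{
#ifdef __APPLE__
    uint64_t tid64 = 0;
    pthread_threadid_np(NULL, &tid64);  /* pointer width matches callee */
    return (unsigned long)tid64;        /* cast only after the call */
#else
    /* Non-Mac fallback, present only to keep this sketch self-contained. */
    return (unsigned long)pthread_self();
#endif
}
```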
[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.
[ https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7933: - Attachment: HDFS-7933.01.patch Fix the unit test failure from TestClientReportBadBlock due to fsck output string change. fsck should also report decommissioning replicas. -- Key: HDFS-7933 URL: https://issues.apache.org/jira/browse/HDFS-7933 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Jitendra Nath Pandey Assignee: Xiaoyu Yao Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch Fsck doesn't count replicas that are on decommissioning nodes. If a block has all replicas on the decommissioning nodes, it will be marked as missing, which is alarming for the admins, although the system will replicate them before nodes are decommissioned. Fsck output should also show decommissioning replicas along with the live replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7717) Erasure Coding: provide a tool for convert files between replication and erasure coding
[ https://issues.apache.org/jira/browse/HDFS-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7717: Assignee: Kai Sasaki (was: Jing Zhao) Erasure Coding: provide a tool for convert files between replication and erasure coding --- Key: HDFS-7717 URL: https://issues.apache.org/jira/browse/HDFS-7717 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Kai Sasaki We need a tool to do offline conversion between replication and erasure coding. The tool itself can either utilize MR just like the current distcp, or act like the balancer/mover. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7687: Assignee: Takanobu Asanuma (was: Tsz Wo Nicholas Sze) Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times
[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keisuke Ogiwara updated HDFS-7645: -- Attachment: HDFS-7645.04.patch I have attached a new patch. Please review it when you are free. Thank you very much. Rolling upgrade is restoring blocks from trash multiple times - Key: HDFS-7645 URL: https://issues.apache.org/jira/browse/HDFS-7645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Assignee: Keisuke Ogiwara Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, HDFS-7645.03.patch, HDFS-7645.04.patch When performing an HDFS rolling upgrade, the trash directory is getting restored twice when under normal circumstances it shouldn't need to be restored at all. IIUC, the only time these blocks should be restored is if we need to roll back a rolling upgrade. On a busy cluster, this can cause significant and unnecessary block churn both on the datanodes, and more importantly in the namenode. The two times this happens are: 1) restart of DN onto new software
{code}
private void doTransition(DataNode datanode, StorageDirectory sd,
    NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
  if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
    Preconditions.checkState(!getTrashRootDir(sd).exists(),
        sd.getPreviousDir() + " and " + getTrashRootDir(sd) +
        " should not both be present.");
    doRollback(sd, nsInfo); // rollback if applicable
  } else {
    // Restore all the files in the trash. The restored files are retained
    // during rolling upgrade rollback. They are deleted during rolling
    // upgrade downgrade.
    int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
    LOG.info("Restored " + restored + " block files from trash.");
  }
{code}
2) When the heartbeat response no longer indicates a rolling upgrade is in progress
{code}
/**
 * Signal the current rolling upgrade status as indicated by the NN.
 * @param inProgress true if a rolling upgrade is in progress
 */
void signalRollingUpgrade(boolean inProgress) throws IOException {
  String bpid = getBlockPoolId();
  if (inProgress) {
    dn.getFSDataset().enableTrash(bpid);
    dn.getFSDataset().setRollingUpgradeMarker(bpid);
  } else {
    dn.getFSDataset().restoreTrash(bpid);
    dn.getFSDataset().clearRollingUpgradeMarker(bpid);
  }
}
{code}
HDFS-6800 and HDFS-6981 modified this behavior, making it not completely clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7620) Change disk quota calculation for EC files
[ https://issues.apache.org/jira/browse/HDFS-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364668#comment-14364668 ] Takanobu Asanuma commented on HDFS-7620: Oh, thanks for your help, Jing. Change disk quota calculation for EC files -- Key: HDFS-7620 URL: https://issues.apache.org/jira/browse/HDFS-7620 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze EC files has different disk space usage than replication. We need to change quota calculation to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7912) Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks
[ https://issues.apache.org/jira/browse/HDFS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364669#comment-14364669 ] Zhe Zhang commented on HDFS-7912: - Sure, let's revisit it later. +1 on the patch. Thanks Jing for the work! Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks -- Key: HDFS-7912 URL: https://issues.apache.org/jira/browse/HDFS-7912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-7912.000.patch Now with striped blocks and the design that uses a single BlockInfoStriped object to track all the corresponding blocks, we need to clearly distinguish the type Block and BlockInfo in BlockManager. Specifically, data structures like {{UnderReplicatedBlocks}} and {{PendingReplicationBlocks}} should track BlockInfo instead of Block in order to support striped block recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
[ https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364775#comment-14364775 ] Brahma Reddy Battula commented on HDFS-7261: Thanks a lot for the review.. will update the patch.. storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState() --- Key: HDFS-7261 URL: https://issues.apache.org/jira/browse/HDFS-7261 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Brahma Reddy Battula Attachments: HDFS-7261-001.patch, HDFS-7261.patch Here is the code:
{code}
failedStorageInfos = new HashSet<DatanodeStorageInfo>(
    storageMap.values());
{code}
In other places, the lock on DatanodeDescriptor.storageMap is held:
{code}
synchronized (storageMap) {
  final Collection<DatanodeStorageInfo> storages = storageMap.values();
  return storages.toArray(new DatanodeStorageInfo[storages.size()]);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
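The locking convention at issue can be sketched in isolation. This is a hedged illustration with simplified stand-in types, not the real DatanodeDescriptor: the point is that every reader iterating the map's values (including the heartbeat path that builds failedStorageInfos) must hold the same monitor the writers use, so a concurrent mutation cannot race the copy.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hedged sketch (assumption: simplified stand-in types, not the real
// DatanodeDescriptor). All access to storageMap, reads included, goes
// through synchronized (storageMap) blocks.
public class StorageMapSketch {
    private final Map<String, String> storageMap = new HashMap<>();

    public void put(String id, String info) {
        synchronized (storageMap) {
            storageMap.put(id, info);
        }
    }

    // Snapshot under the same lock the writers use, so a concurrent
    // heartbeat cannot mutate the map mid-copy.
    public Set<String> snapshotStorages() {
        synchronized (storageMap) {
            return new HashSet<>(storageMap.values());
        }
    }
}
```

Copying into a new collection inside the synchronized block (rather than returning a live view) is what makes it safe to iterate the snapshot after the lock is released.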
[jira] [Commented] (HDFS-7912) Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks
[ https://issues.apache.org/jira/browse/HDFS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364657#comment-14364657 ] Jing Zhao commented on HDFS-7912: - Yeah, the block contained in the LocatedBlock {{lBlk}} here is actually a BlockInfo object since it is created by {{BlockManager#createLocatedBlock}}. But currently I do not see an easy way to avoid this extra {{getStoredBlock}} call. How about revisiting this when we update some other fsck code for supporting striped blocks? Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks -- Key: HDFS-7912 URL: https://issues.apache.org/jira/browse/HDFS-7912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-7912.000.patch Now with striped blocks and the design that uses a single BlockInfoStriped object to track all the corresponding blocks, we need to clearly distinguish the type Block and BlockInfo in BlockManager. Specifically, data structures like {{UnderReplicatedBlocks}} and {{PendingReplicationBlocks}} should track BlockInfo instead of Block in order to support striped block recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7938) OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac
[ https://issues.apache.org/jira/browse/HDFS-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kiran Kumar M R reassigned HDFS-7938: - Assignee: Kiran Kumar M R OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac Key: HDFS-7938 URL: https://issues.apache.org/jira/browse/HDFS-7938 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Colin Patrick McCabe Assignee: Kiran Kumar M R Priority: Critical In OpensslSecureRandom.c, pthread_threadid_np is being used with an unsigned long, but the type signature requires a uint64_t. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
[ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7916: Attachment: HDFS-7916-02.patch 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop -- Key: HDFS-7916 URL: https://issues.apache.org/jira/browse/HDFS-7916 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-7916-01.patch, HDFS-7916-02.patch if any badblock found, then BPSA for StandbyNode will go for infinite times to report it. {noformat}2015-03-11 19:43:41,528 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: stobdtserver3/10.224.54.70:18010 org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed to report bad block BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: at org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364675#comment-14364675 ] Takanobu Asanuma commented on HDFS-7687: I'd like to try to do this ticket. Please would you assign it to me? Thank you. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
[ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364738#comment-14364738 ] Vinayakumar B commented on HDFS-7916: - latest patch attached is just to solve the jenkins problem. Not related to this issue, 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop -- Key: HDFS-7916 URL: https://issues.apache.org/jira/browse/HDFS-7916 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-7916-01.patch, HDFS-7916-02.patch if any badblock found, then BPSA for StandbyNode will go for infinite times to report it. {noformat}2015-03-11 19:43:41,528 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: stobdtserver3/10.224.54.70:18010 org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed to report bad block BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: at org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7838: - Resolution: Fixed Fix Version/s: 2.7.0 Target Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Colin for review. Committed to trunk, branch-2 and branch-2.7. Expose truncate API for libhdfs --- Key: HDFS-7838 URL: https://issues.apache.org/jira/browse/HDFS-7838 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch, HDFS-7838.003.patch, HDFS-7838.004.patch It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7471) TestDatanodeManager#testNumVersionsReportedCorrect occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364770#comment-14364770 ] Binglin Chang commented on HDFS-7471: - Hi [~szetszwo], I think the main concern with the patch is that it may hide a race-condition problem. I see in the code that the state is periodically refreshed by countSoftwareVersions, so a count mismatch caused by a temporary race condition may not be a problem, and may even be expected. So I don't see much real damage here. Could you help review it? Thanks. TestDatanodeManager#testNumVersionsReportedCorrect occasionally fails - Key: HDFS-7471 URL: https://issues.apache.org/jira/browse/HDFS-7471 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Ted Yu Assignee: Binglin Chang Attachments: HDFS-7471.001.patch From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1957/ :
{code}
FAILED: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect

Error Message:
The map of version counts returned by DatanodeManager was not what it was expected to be on iteration 237 expected:0 but was:1

Stack Trace:
java.lang.AssertionError: The map of version counts returned by DatanodeManager was not what it was expected to be on iteration 237 expected:0 but was:1
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect(TestDatanodeManager.java:150)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7620) Change disk quota calculation for EC files
[ https://issues.apache.org/jira/browse/HDFS-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-7620. - Resolution: Duplicate Hi [~tasanuma0829], I just found that this issue has already been solved by HDFS-7826. I will resolve this as a duplicate, but please feel free to create new jiras if you still see issues there. Change disk quota calculation for EC files -- Key: HDFS-7620 URL: https://issues.apache.org/jira/browse/HDFS-7620 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze EC files have different disk space usage than replicated files. We need to change the quota calculation to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364766#comment-14364766 ] Hudson commented on HDFS-7838: -- FAILURE: Integrated in Hadoop-trunk-Commit #7344 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7344/]) HDFS-7838. Expose truncate API for libhdfs. (yliu) (yliu: rev 48c2db34eff376c0f3a72587a5540b1e3dffafd2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/hdfs_web.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Expose truncate API for libhdfs --- Key: HDFS-7838 URL: https://issues.apache.org/jira/browse/HDFS-7838 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch, HDFS-7838.003.patch, HDFS-7838.004.patch It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7929) inotify unable fetch pre-upgrade edit log segments once upgrade starts
[ https://issues.apache.org/jira/browse/HDFS-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7929: Attachment: HDFS-7929-002.patch Thanks [~cmccabe] for the review. The new patch addresses the comment on listing files and adds a new unit test. Yes I agree we should file a followup JIRA examining journal manager. inotify unable fetch pre-upgrade edit log segments once upgrade starts -- Key: HDFS-7929 URL: https://issues.apache.org/jira/browse/HDFS-7929 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7929-000.patch, HDFS-7929-001.patch, HDFS-7929-002.patch inotify is often used to periodically poll HDFS events. However, once an HDFS upgrade has started, edit logs are moved to /previous on the NN, which is not accessible. Moreover, once the upgrade is finalized /previous is currently lost forever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7697) Document the scope of the PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7697: - Status: Patch Available (was: Open) Document the scope of the PB OIV tool - Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7945) The WebHdfs system on DN does not honor the length parameter
[ https://issues.apache.org/jira/browse/HDFS-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7945: - Attachment: HDFS-7945.000.patch The WebHdfs system on DN does not honor the length parameter Key: HDFS-7945 URL: https://issues.apache.org/jira/browse/HDFS-7945 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-7945.000.patch HDFS-7279 introduces a new WebHdfs server on the DN. The new server does not honor the length parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7811) Avoid recursive call getStoragePolicyID in INodeFile#computeQuotaUsage
[ https://issues.apache.org/jira/browse/HDFS-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366129#comment-14366129 ] Jing Zhao commented on HDFS-7811: - Thanks for working on this, Xiaoyu. Some comments below: # In {{tryGetBlockStoragePolicyId}}, it looks like {{getStoragePolicyID}} will not throw {{UnsupportedOperationException}} in our current code?
{code}
try {
  storagePolicyId = getStoragePolicyID();
} catch (UnsupportedOperationException ex) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("BlockStoragePolicy is not supported on " + this.getClass());
  }
}
{code}
# We now have 4 different versions of {{computeQuotaUsage}} to support different parameter combinations. We can remove some of them for simplification. E.g., {{computeQuotaUsage(BlockStoragePolicySuite, byte, QuotaCounts, boolean)}} is only called in one place. # In {{INodeFile#computeQuotaUsage}}, we should check whether the INodeFile has a local storage policy and, if so, use it to overwrite the given policy.
{code}
byte storagePolicyId = (blockStoragePolicyId != BlockStoragePolicySuite.ID_UNSPECIFIED) ?
    blockStoragePolicyId : getStoragePolicyID();
{code}
# Similar logic should also be used in INodeDirectory. In general, we should use the current directory's storage policy (if there is one) to overwrite the given policy and pass it down to its children. # For deleted files/directories in snapshots, we can do the same thing as for a normal child file/directory. Thus simply passing the parent directory's storage policy down should be good enough. Handling a reference INode can be more complicated, but we can do it in a separate jira. 
{code}
-deleted.computeQuotaUsage(bsps, counts, false, Snapshot.CURRENT_STATE_ID);
+deleted.computeQuotaUsage(bsps, deleted.tryGetBlockStoragePolicyId(),
+    counts, false, Snapshot.CURRENT_STATE_ID);
{code}
# Minor: there are a couple of places where the line length exceeds 80 characters. Avoid recursive call getStoragePolicyID in INodeFile#computeQuotaUsage -- Key: HDFS-7811 URL: https://issues.apache.org/jira/browse/HDFS-7811 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7811.00.patch This is a follow-up based on a comment from [~jingzhao] on HDFS-7723. I just noticed that INodeFile#computeQuotaUsage calls getStoragePolicyID to identify the storage policy id of the file. This may not be very efficient (especially when we're computing the quota usage of a directory) because getStoragePolicyID may recursively check the ancestral INode's storage policy. I think an improvement here would be to pass the lowest parent directory's storage policy down while traversing the tree. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
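The pass-down idea in the review above can be sketched as a simple tree walk: each node uses the policy handed down by its parent unless it has a local policy of its own, so no node ever has to walk back up through its ancestors. This is a hypothetical simplified model, not the real INode hierarchy:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified tree where each node may carry its own storage policy id. */
class PolicyTree {
    static final byte ID_UNSPECIFIED = 0;

    byte localPolicy; // ID_UNSPECIFIED means "inherit from parent"
    List<PolicyTree> children = new ArrayList<>();

    PolicyTree(byte localPolicy) { this.localPolicy = localPolicy; }

    /**
     * Collects the effective policy of every leaf in one top-down pass,
     * passing the parent's policy along instead of recursively consulting
     * ancestors for each file (the optimization HDFS-7811 describes).
     */
    void collectEffectivePolicies(byte inherited, List<Byte> out) {
        // A local policy overrides whatever was handed down.
        byte effective = (localPolicy != ID_UNSPECIFIED) ? localPolicy : inherited;
        if (children.isEmpty()) {
            out.add(effective);
        } else {
            for (PolicyTree child : children) {
                child.collectEffectivePolicies(effective, out);
            }
        }
    }
}
```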
[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation
[ https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366135#comment-14366135 ] Jing Zhao commented on HDFS-7587: - Thanks for the comment, Kihwal. Actually I mentioned this in my previous comment: with the fix from HDFS-7943 we will not have blocks larger than the preferred block size. But I would appreciate it if you could also confirm. Edit log corruption can happen if append fails with a quota violation - Key: HDFS-7587 URL: https://issues.apache.org/jira/browse/HDFS-7587 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Kihwal Lee Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-7587.001.patch, HDFS-7587.002.patch, HDFS-7587.patch We have seen a standby namenode crashing due to edit log corruption. It was complaining that {{OP_CLOSE}} cannot be applied because the file is not under-construction. When a client was trying to append to the file, the remaining space quota was very small. This caused a failure in {{prepareFileForWrite()}}, but after the inode was already converted for writing and a lease added. Since these were not undone when the quota violation was detected, the file was left in under-construction with an active lease without edit logging {{OP_ADD}}. A subsequent {{append()}} eventually caused a lease recovery after the soft limit period. This resulted in {{commitBlockSynchronization()}}, which closed the file with {{OP_CLOSE}} being logged. Since there was no corresponding {{OP_ADD}}, edit replaying could not apply this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation
[ https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366182#comment-14366182 ] Kihwal Lee commented on HDFS-7587: -- Sorry, I missed the comment. Should have refreshed. Edit log corruption can happen if append fails with a quota violation - Key: HDFS-7587 URL: https://issues.apache.org/jira/browse/HDFS-7587 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Kihwal Lee Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-7587.001.patch, HDFS-7587.002.patch, HDFS-7587.patch We have seen a standby namenode crashing due to edit log corruption. It was complaining that {{OP_CLOSE}} cannot be applied because the file is not under-construction. When a client was trying to append to the file, the remaining space quota was very small. This caused a failure in {{prepareFileForWrite()}}, but after the inode was already converted for writing and a lease added. Since these were not undone when the quota violation was detected, the file was left in under-construction with an active lease without edit logging {{OP_ADD}}. A subsequent {{append()}} eventually caused a lease recovery after the soft limit period. This resulted in {{commitBlockSynchronization()}}, which closed the file with {{OP_CLOSE}} being logged. Since there was no corresponding {{OP_ADD}}, edit replaying could not apply this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
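The corruption described above arises because state changes (converting the inode for writing, adding a lease) were not rolled back when the quota check threw. The general fix pattern is to undo the in-memory mutations before propagating the exception, so memory never diverges from what was edit-logged. A minimal sketch with hypothetical names, not the actual NameNode code:

```java
/**
 * Demonstrates undoing in-memory state when a later check fails, so the
 * in-memory view stays consistent with the (never-written) edit log entry.
 */
class AppendDemo {
    boolean underConstruction = false;
    boolean leaseHeld = false;

    static class QuotaExceededException extends Exception {}

    void prepareFileForWrite(boolean quotaOk) throws QuotaExceededException {
        underConstruction = true; // convert inode for writing
        leaseHeld = true;         // add lease
        if (!quotaOk) {
            // Undo both changes before propagating; otherwise the file stays
            // under-construction with an active lease but no OP_ADD logged.
            underConstruction = false;
            leaseHeld = false;
            throw new QuotaExceededException();
        }
        // ... would log OP_ADD here ...
    }
}
```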
[jira] [Updated] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads
[ https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7944: Status: Patch Available (was: Open) Minor cleanup of BlockPoolManager#getAllNamenodeThreads --- Key: HDFS-7944 URL: https://issues.apache.org/jira/browse/HDFS-7944 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-7944.01.patch {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an unmodifiable list. Since NN addition/removal is relatively rare we can just use a {{CopyOnWriteArrayList}} for concurrency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads
[ https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7944: Attachment: HDFS-7944.01.patch Minor cleanup of BlockPoolManager#getAllNamenodeThreads --- Key: HDFS-7944 URL: https://issues.apache.org/jira/browse/HDFS-7944 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-7944.01.patch {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an unmodifiable list. Since NN addition/removal is relatively rare we can just use a {{CopyOnWriteArrayList}} for concurrency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
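The cleanup described above can be sketched as follows: keep the actors in a {{CopyOnWriteArrayList}} (writes copy the backing array, which is cheap because NN addition/removal is rare) and hand callers a read-only view instead of converting between lists and arrays. A hypothetical simplified class, not the actual BlockPoolManager:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Simplified BlockPoolManager-style registry of per-NN actor threads. */
class ActorRegistry {
    // Safe for concurrent iteration without locking; each write copies the
    // backing array, acceptable since NN addition/removal is infrequent.
    private final List<String> actors = new CopyOnWriteArrayList<>();

    void add(String actor) {
        actors.add(actor);
    }

    /** Callers get an unmodifiable view; no list<->array conversions needed. */
    List<String> getAll() {
        return Collections.unmodifiableList(actors);
    }
}
```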
[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times
[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366067#comment-14366067 ] Hadoop QA commented on HDFS-7645: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705115/HDFS-7645.05.patch against trunk revision a89b087. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9937//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9937//console This message is automatically generated. Rolling upgrade is restoring blocks from trash multiple times - Key: HDFS-7645 URL: https://issues.apache.org/jira/browse/HDFS-7645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Assignee: Keisuke Ogiwara Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch When performing an HDFS rolling upgrade, the trash directory is getting restored twice when under normal circumstances it shouldn't need to be restored at all. iiuc, the only time these blocks should be restored is if we need to rollback a rolling upgrade. 
On a busy cluster, this can cause significant and unnecessary block churn both on the datanodes, and more importantly in the namenode. The two times this happens are: 1) restart of DN onto new software
{code}
private void doTransition(DataNode datanode, StorageDirectory sd,
    NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
  if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
    Preconditions.checkState(!getTrashRootDir(sd).exists(),
        sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not " +
        "both be present.");
    doRollback(sd, nsInfo); // rollback if applicable
  } else {
    // Restore all the files in the trash. The restored files are retained
    // during rolling upgrade rollback. They are deleted during rolling
    // upgrade downgrade.
    int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
    LOG.info("Restored " + restored + " block files from trash.");
  }
{code}
2) When heartbeat response no longer indicates a rollingupgrade is in progress
{code}
/**
 * Signal the current rolling upgrade status as indicated by the NN.
 * @param inProgress true if a rolling upgrade is in progress
 */
void signalRollingUpgrade(boolean inProgress) throws IOException {
  String bpid = getBlockPoolId();
  if (inProgress) {
    dn.getFSDataset().enableTrash(bpid);
    dn.getFSDataset().setRollingUpgradeMarker(bpid);
  } else {
    dn.getFSDataset().restoreTrash(bpid);
    dn.getFSDataset().clearRollingUpgradeMarker(bpid);
  }
}
{code}
HDFS-6800 and HDFS-6981 were modifying this behavior, making it not completely clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times
[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366086#comment-14366086 ] Kihwal Lee commented on HDFS-7645: -- It has been a while since I saw an hdfs precommit working and returning with all +1s. :) Rolling upgrade is restoring blocks from trash multiple times - Key: HDFS-7645 URL: https://issues.apache.org/jira/browse/HDFS-7645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Assignee: Keisuke Ogiwara Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch When performing an HDFS rolling upgrade, the trash directory is getting restored twice when under normal circumstances it shouldn't need to be restored at all. iiuc, the only time these blocks should be restored is if we need to rollback a rolling upgrade. On a busy cluster, this can cause significant and unnecessary block churn both on the datanodes, and more importantly in the namenode. The two times this happens are: 1) restart of DN onto new software
{code}
private void doTransition(DataNode datanode, StorageDirectory sd,
    NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
  if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
    Preconditions.checkState(!getTrashRootDir(sd).exists(),
        sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not " +
        "both be present.");
    doRollback(sd, nsInfo); // rollback if applicable
  } else {
    // Restore all the files in the trash. The restored files are retained
    // during rolling upgrade rollback. They are deleted during rolling
    // upgrade downgrade.
    int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
    LOG.info("Restored " + restored + " block files from trash.");
  }
{code}
2) When heartbeat response no longer indicates a rollingupgrade is in progress {code} /** * Signal the current rolling upgrade status as indicated by the NN. 
* @param inProgress true if a rolling upgrade is in progress */ void signalRollingUpgrade(boolean inProgress) throws IOException { String bpid = getBlockPoolId(); if (inProgress) { dn.getFSDataset().enableTrash(bpid); dn.getFSDataset().setRollingUpgradeMarker(bpid); } else { dn.getFSDataset().restoreTrash(bpid); dn.getFSDataset().clearRollingUpgradeMarker(bpid); } } {code} HDFS-6800 and HDFS-6981 were modifying this behavior making it not completely clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
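One way to avoid the double restore described above is to make the restore idempotent per upgrade cycle, so that both the startup path and the heartbeat path can call it safely. A minimal sketch with hypothetical names, not the actual DataNode change:

```java
/** Restores trash at most once per rolling-upgrade cycle. */
class TrashRestorer {
    private boolean restored = false;
    private int blocksInTrash = 42; // stand-in for actual trash contents

    /**
     * Returns the number of block files restored; repeated calls in the
     * same cycle are no-ops, eliminating the redundant block churn.
     */
    int restoreBlockFilesFromTrash() {
        if (restored) {
            return 0; // already restored in this cycle, skip
        }
        restored = true;
        int n = blocksInTrash;
        blocksInTrash = 0;
        return n;
    }
}
```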
[jira] [Commented] (HDFS-7713) Improve the HDFS Web UI browser to allow creating dirs
[ https://issues.apache.org/jira/browse/HDFS-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366091#comment-14366091 ] Hadoop QA commented on HDFS-7713: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705125/HDFS-7713.06.patch against trunk revision 487374b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.util.TestByteArrayManager org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9938//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9938//console This message is automatically generated. 
Improve the HDFS Web UI browser to allow creating dirs -- Key: HDFS-7713 URL: https://issues.apache.org/jira/browse/HDFS-7713 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7713.01.patch, HDFS-7713.02.patch, HDFS-7713.03.patch, HDFS-7713.04.patch, HDFS-7713.05.patch, HDFS-7713.06.patch This sub-task JIRA is for improving the NN HTML5 UI to allow the user to create directories. It uses WebHDFS and adds to the great work done in HDFS-6252 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7922) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors
[ https://issues.apache.org/jira/browse/HDFS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366168#comment-14366168 ] Hadoop QA commented on HDFS-7922: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704669/002-HDFS-7922.patch against trunk revision d884670. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9939//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9939//console This message is automatically generated. ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors - Key: HDFS-7922 URL: https://issues.apache.org/jira/browse/HDFS-7922 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Attachments: 001-HDFS-7922.patch, 002-HDFS-7922.patch ShortCircuitCache has the following executors. It would be good to shutdown these pools during ShortCircuitCache#close to avoid leaks. {code} /** * The executor service that runs the cacheCleaner. 
*/
private final ScheduledThreadPoolExecutor cleanerExecutor =
    new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
        setDaemon(true).setNameFormat("ShortCircuitCache_Cleaner").
        build());

/**
 * The executor service that runs the cacheCleaner.
 */
private final ScheduledThreadPoolExecutor releaserExecutor =
    new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
        setDaemon(true).setNameFormat("ShortCircuitCache_SlotReleaser").
        build());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
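The leak described above is fixed by shutting the pools down when the owner is closed. A self-contained sketch of the pattern using only the JDK (hypothetical class, not the actual ShortCircuitCache code):

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;

/** Cache-like object that owns scheduled executors and releases them on close. */
class ExecutorOwner implements AutoCloseable {
    private final ScheduledThreadPoolExecutor cleanerExecutor =
        new ScheduledThreadPoolExecutor(1);
    private final ScheduledThreadPoolExecutor releaserExecutor =
        new ScheduledThreadPoolExecutor(1);

    @Override
    public void close() {
        // Without these calls the pools (and any worker threads they have
        // started) outlive the cache -- the leak this JIRA describes.
        cleanerExecutor.shutdown();
        releaserExecutor.shutdown();
    }

    boolean allShutDown() {
        return cleanerExecutor.isShutdown() && releaserExecutor.isShutdown();
    }
}
```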
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364850#comment-14364850 ] Tsz Wo Nicholas Sze commented on HDFS-7687: --- Thanks Takanobu. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5356) MiniDFSCluster shoud close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364864#comment-14364864 ] Vinayakumar B commented on HDFS-5356: - Committed to trunk and branch-2. Thanks for the contribution [~rakeshr]. Thanks [~cmccabe] for reviews. MiniDFSCluster shoud close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some unit tests related to metrics failed. Because MiniDFSCluster never closes open FileSystems, DFSClients stay alive after MiniDFSCluster shutdown(). The metrics of those DFSClients still exist in DefaultMetricsSystem, and this makes other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5356) MiniDFSCluster shoud close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-5356: Resolution: Fixed Fix Version/s: 2.8.0 Target Version/s: (was: 2.4.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) MiniDFSCluster shoud close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some unit tests related to metrics failed. Because MiniDFSCluster never closes open FileSystems, DFSClients stay alive after MiniDFSCluster shutdown(). The metrics of those DFSClients still exist in DefaultMetricsSystem, and this makes other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5356) MiniDFSCluster shoud close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364915#comment-14364915 ] Hudson commented on HDFS-5356: -- FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/869/]) HDFS-5356. MiniDFSCluster should close all open FileSystems when shutdown() (Contributed by Rakesh R) (vinayakumarb: rev 018893e81ec1c43e6c79c77adec92c2edfb20cab) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java MiniDFSCluster shoud close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some unit tests related to metrics failed. Because MiniDFSCluster never closes open FileSystems, DFSClients stay alive after MiniDFSCluster shutdown(). The metrics of those DFSClients still exist in DefaultMetricsSystem, and this makes other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
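The fix for the issue above amounts to tracking the file systems a cluster hands out and closing them all on shutdown, so the associated clients (and their metrics) are released. A simplified sketch of the pattern using only {{java.io.Closeable}} (hypothetical names, not the actual MiniDFSCluster change):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Mini-cluster-style owner that closes every file system it handed out. */
class ClusterDemo {
    private final List<Closeable> openFileSystems = new ArrayList<>();

    /** Registers a "file system" so shutdown can close it later. */
    <T extends Closeable> T track(T fs) {
        openFileSystems.add(fs);
        return fs;
    }

    /** Closes all tracked file systems before tearing down the cluster. */
    void shutdown() {
        for (Closeable fs : openFileSystems) {
            try {
                fs.close(); // releases the client and its registered metrics
            } catch (IOException e) {
                // best effort: keep closing the rest
            }
        }
        openFileSystems.clear();
    }
}
```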
[jira] [Commented] (HDFS-2360) Ugly stacktrace when quota exceeds
[ https://issues.apache.org/jira/browse/HDFS-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364925#comment-14364925 ] Hudson commented on HDFS-2360: -- FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/869/]) HDFS-2360. Ugly stacktrce when quota exceeds. (harsh) (harsh: rev 046521cd6511b7fc6d9478cb2bed90d8e75fca20) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Ugly stacktrace when quota exceeds -- Key: HDFS-2360 URL: https://issues.apache.org/jira/browse/HDFS-2360 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 0.23.0 Reporter: Rajit Saha Assignee: Harsh J Priority: Minor Fix For: 2.8.0 Attachments: HDFS-2360.patch, HDFS-2360.patch Would it be better to catch the exception and show the user a short, reasonable message when they exceed the quota?
$hdfs dfs -mkdir testDir
$hdfs dfsadmin -setSpaceQuota 191M testDir
$hdfs dfs -count -q testDir none inf 200278016 2002780161 0 0 hdfs://NN hostname:port/user/hdfsqa/testDir
$hdfs dfs -put /etc/passwd /user/hadoopqa/testDir
11/09/19 08:08:15 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:365) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1100) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:972) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454) Caused by: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/hdfsqa/testDir is exceeded: quota=191.0m diskspace consumed=768.0m at org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609) at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476) at
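The improvement committed here amounts to reporting the quota violation as one concise message instead of dumping the full server-side stack trace at the user. A minimal JDK-only sketch of that pattern follows; `QuotaDemo` and `SpaceQuotaException` are illustrative stand-ins, not the actual DFSClient or DSQuotaExceededException classes.

```java
// Sketch: surface a quota violation as a one-line, user-facing message
// rather than letting the exception propagate with its server-side trace.
public class QuotaDemo {
    static class SpaceQuotaException extends RuntimeException {
        SpaceQuotaException(String msg) { super(msg); }
    }

    // Returns "ok" on success, or a short warning line on quota violation.
    public static String writeWithQuota(long quota, long consumed) {
        try {
            if (consumed > quota) {
                throw new SpaceQuotaException(
                    "The DiskSpace quota is exceeded: quota=" + quota
                    + " diskspace consumed=" + consumed);
            }
            return "ok";
        } catch (SpaceQuotaException e) {
            // Log only the message; the full trace adds nothing the user can act on.
            return "WARN: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeWithQuota(191, 768));
    }
}
```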
[jira] [Commented] (HDFS-7884) NullPointerException in BlockSender
[ https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364797#comment-14364797 ] Brahma Reddy Battula commented on HDFS-7884: Thanks a lot for the review and comments; I will look into this one. (Sorry for the late reply, I was on leave.) NullPointerException in BlockSender --- Key: HDFS-7884 URL: https://issues.apache.org/jira/browse/HDFS-7884 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Blocker Attachments: HDFS-7884.patch, h7884_20150313.patch, org.apache.hadoop.hdfs.TestAppendSnapshotTruncate-output.txt {noformat} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:264) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249) at java.lang.Thread.run(Thread.java:745) {noformat} BlockSender.java:264 is shown below: {code} this.volumeRef = datanode.data.getVolume(block).obtainReference(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
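The chained call in the quoted line can NPE when `getVolume(block)` returns null, e.g. because the replica was removed concurrently. A hedged, JDK-only sketch of the defensive check such a fix typically adds; `VolumeLookup`, `Volume`, and `getVolumeChecked` are illustrative names, not the actual FsDatasetSpi API.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch: fail with a descriptive IOException instead of an NPE when the
// volume for a block cannot be found.
public class VolumeLookup {
    public static class Volume {}
    private final Map<String, Volume> volumes = new HashMap<>();

    public void add(String blockId) { volumes.put(blockId, new Volume()); }

    // Null-checks the lookup before the caller dereferences the result.
    public Volume getVolumeChecked(String blockId) throws IOException {
        Volume v = volumes.get(blockId);
        if (v == null) {
            throw new IOException("No volume found for block " + blockId
                + "; the replica may have been removed concurrently");
        }
        return v;
    }
}
```

The descriptive message makes the failure diagnosable from the client-side log, which a bare NullPointerException is not.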
[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times
[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364882#comment-14364882 ] Vinayakumar B commented on HDFS-7645: - bq. DNs look for RollingUpgradeStatus in the heartbeat response. If it is absent then DNs infer that the rolling upgrade is finalized. If the administrator attempts to do a rollback without stopping all DNs first then clearing trash will cause data loss. Even if the administrator does it by mistake, it will be an irrecoverable data loss. To avoid this, how about keeping the finalized {{RollingUpgradeStatus}} in the NameNode once the upgrade is finalized, instead of making it null? And in DNs we can specifically check for the FINALIZED status before clearing the trash. Any thoughts? Rolling upgrade is restoring blocks from trash multiple times - Key: HDFS-7645 URL: https://issues.apache.org/jira/browse/HDFS-7645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Assignee: Keisuke Ogiwara Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, HDFS-7645.03.patch, HDFS-7645.04.patch When performing an HDFS rolling upgrade, the trash directory is getting restored twice when under normal circumstances it shouldn't need to be restored at all. IIUC, the only time these blocks should be restored is if we need to roll back a rolling upgrade. On a busy cluster, this can cause significant and unnecessary block churn both on the datanodes, and more importantly in the namenode. 
The two times this happens are: 1) restart of DN onto new software {code} private void doTransition(DataNode datanode, StorageDirectory sd, NamespaceInfo nsInfo, StartupOption startOpt) throws IOException { if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) { Preconditions.checkState(!getTrashRootDir(sd).exists(), sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not both be present."); doRollback(sd, nsInfo); // rollback if applicable } else { // Restore all the files in the trash. The restored files are retained // during rolling upgrade rollback. They are deleted during rolling // upgrade downgrade. int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd)); LOG.info("Restored " + restored + " block files from trash."); } {code} 2) When the heartbeat response no longer indicates a rolling upgrade is in progress {code} /** * Signal the current rolling upgrade status as indicated by the NN. * @param inProgress true if a rolling upgrade is in progress */ void signalRollingUpgrade(boolean inProgress) throws IOException { String bpid = getBlockPoolId(); if (inProgress) { dn.getFSDataset().enableTrash(bpid); dn.getFSDataset().setRollingUpgradeMarker(bpid); } else { dn.getFSDataset().restoreTrash(bpid); dn.getFSDataset().clearRollingUpgradeMarker(bpid); } } {code} HDFS-6800 and HDFS-6981 modified this behavior, making it not completely clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
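One way to make the double restore described above harmless is to guard restoration behind a once-only flag, so that whichever of the two code paths runs first does the work and the other becomes a no-op. A minimal JDK-only sketch of that idea; `TrashRestorer` and `restoreOnce` are illustrative names, not the HDFS implementation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: idempotent trash restoration, safe even if the restart path and
// the heartbeat path both trigger it.
public class TrashRestorer {
    private final AtomicBoolean restored = new AtomicBoolean(false);
    private int restoreCount = 0;

    // Returns true only the first time restoration actually runs.
    public boolean restoreOnce() {
        if (restored.compareAndSet(false, true)) {
            restoreCount++;  // stand-in for restoreBlockFilesFromTrash()
            return true;
        }
        return false;        // already restored; skip the block churn
    }

    public int getRestoreCount() { return restoreCount; }
}
```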
[jira] [Commented] (HDFS-7939) Two fsimage_rollback_* files are created which are not deleted after rollback.
[ https://issues.apache.org/jira/browse/HDFS-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364884#comment-14364884 ] J.Andreina commented on HDFS-7939: -- Step 1: NN1 is active, NN2 is standby. Step 2: Perform hdfs dfsadmin -rollingUpgrade prepare Step 3: Active NN1 goes down. {noformat} NN1: -rw-r--r-- 1 Rex users 67 Mar 17 17:35 edits_001-003 -rw-r--r-- 1 Rex users 1048576 Mar 17 17:35 edits_inprogress_004 -rw-r--r-- 1 Rex users 350 Mar 17 17:33 fsimage_000 -rw-r--r-- 1 Rex users 62 Mar 17 17:33 fsimage_000.md5 -rw-r--r-- 1 Rex users 2 Mar 17 17:35 seen_txid -rw-r--r-- 1 Rex users 206 Mar 17 17:33 VERSION NN2: -rw-r--r-- 1 Rex users 1048576 Mar 17 17:38 edits_inprogress_005 -rw-r--r-- 1 Rex users 349 Mar 17 17:37 fsimage_000 -rw-r--r-- 1 Rex users 62 Mar 17 17:37 fsimage_000.md5 -rw-r--r-- 1 Rex users 2 Mar 17 17:37 seen_txid -rw-r--r-- 1 Rex users 205 Mar 17 17:37 VERSION {noformat} Step 4: Restart NN2 with the rollingUpgrade started option. (Created fsimage_rollback_004, closed txn 5, and NN2 became Active, but it was not able to upload to NN1.) Step 5: Restart NN1 with the rollingUpgrade started option. (NN1 became standby) Issue: === NN1 did a checkpoint for one extra txn (id: 5) and uploaded one more fsimage_rollback_005 to NN2. On rollback, NN2 deletes only fsimage_rollback_005, leaving fsimage_rollback_004 behind. 
{noformat} NN2 : -rw-r--r-- 1 Rex users 1048576 Mar 17 17:38 edits_005-005 -rw-r--r-- 1 Rex users 1048576 Mar 17 17:39 edits_inprogress_006 -rw-r--r-- 1 Rex users 349 Mar 17 17:37 fsimage_000 -rw-r--r-- 1 Rex users 62 Mar 17 17:37 fsimage_000.md5 -rw-r--r-- 1 Rex users 356 Mar 17 17:39 fsimage_rollback_004 -rw-r--r-- 1 Rex users 71 Mar 17 17:39 fsimage_rollback_004.md5 -rw-r--r-- 1 Rex users 356 Mar 17 17:39 fsimage_rollback_005 -rw-r--r-- 1 Rex users 71 Mar 17 17:39 fsimage_rollback_005.md5 -rw-r--r-- 1 Rex users 2 Mar 17 17:37 seen_txid -rw-r--r-- 1 Rex users 205 Mar 17 17:39 VERSION NN1 : -rw-r--r-- 1 Rex users 67 Mar 17 17:38 edits_001-003 -rw-r--r-- 1 Rex users 1048576 Mar 17 17:38 edits_inprogress_004 -rw-r--r-- 1 Rex users 349 Mar 17 17:36 fsimage_000 -rw-r--r-- 1 Rex users 62 Mar 17 17:36 fsimage_000.md5 -rw-r--r-- 1 Rex users 356 Mar 17 17:39 fsimage_rollback_005 -rw-r--r-- 1 Rex users 71 Mar 17 17:39 fsimage_rollback_005.md5 -rw-r--r-- 1 Rex users 2 Mar 17 17:38 seen_txid -rw-r--r-- 1 Rex users 205 Mar 17 17:39 VERSION {noformat} Two fsimage_rollback_* files are created which are not deleted after rollback. -- Key: HDFS-7939 URL: https://issues.apache.org/jira/browse/HDFS-7939 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical During checkpointing, if there is any failure in uploading to the remote Namenode, then restarting the Namenode with the rollingUpgrade started option creates 2 fsimage_rollback_* files at the Active Namenode. On rolling upgrade rollback, the initially created fsimage_rollback_* file is not deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times
[ https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364882#comment-14364882 ] Vinayakumar B edited comment on HDFS-7645 at 3/17/15 10:26 AM: --- bq. DNs look for RollingUpgradeStatus in the heartbeat response. If it is absent then DNs infer that the rolling upgrade is finalized. If the administrator attempts to do a rollback without stopping all DNs first then clearing trash will cause data loss. Even if the administrator does it by mistake, it will be an irrecoverable data loss. To avoid this, how about keeping the finalized {{RollingUpgradeStatus}} in the NameNode once the upgrade is finalized, instead of making it null? And in DNs we can specifically check for the FINALIZED status before clearing the trash. Any thoughts? was (Author: vinayrpet): bq. DNs look for RollingUpgradeStatus in the heartbeat response. If it is absent then DNs infer that the rolling upgrade is finalized. If the administrator attempts to do a rollback without stopping all DNs first then clearing trash will cause data loss. Even though administrator does it by mistake, it will be a irrecoverable data loss. Just to avoid this, How about having the finalized {{RollingUpgradeStatus }} in the NameNode once the upgrade is finalized instead of making it null.? And in DNs we can check specifically check for the FINALIZED status before clearing the trash. Any thoughts ? Rolling upgrade is restoring blocks from trash multiple times - Key: HDFS-7645 URL: https://issues.apache.org/jira/browse/HDFS-7645 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Nathan Roberts Assignee: Keisuke Ogiwara Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, HDFS-7645.03.patch, HDFS-7645.04.patch When performing an HDFS rolling upgrade, the trash directory is getting restored twice when under normal circumstances it shouldn't need to be restored at all. 
IIUC, the only time these blocks should be restored is if we need to roll back a rolling upgrade. On a busy cluster, this can cause significant and unnecessary block churn both on the datanodes, and more importantly in the namenode. The two times this happens are: 1) restart of DN onto new software {code} private void doTransition(DataNode datanode, StorageDirectory sd, NamespaceInfo nsInfo, StartupOption startOpt) throws IOException { if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) { Preconditions.checkState(!getTrashRootDir(sd).exists(), sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not both be present."); doRollback(sd, nsInfo); // rollback if applicable } else { // Restore all the files in the trash. The restored files are retained // during rolling upgrade rollback. They are deleted during rolling // upgrade downgrade. int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd)); LOG.info("Restored " + restored + " block files from trash."); } {code} 2) When the heartbeat response no longer indicates a rolling upgrade is in progress {code} /** * Signal the current rolling upgrade status as indicated by the NN. * @param inProgress true if a rolling upgrade is in progress */ void signalRollingUpgrade(boolean inProgress) throws IOException { String bpid = getBlockPoolId(); if (inProgress) { dn.getFSDataset().enableTrash(bpid); dn.getFSDataset().setRollingUpgradeMarker(bpid); } else { dn.getFSDataset().restoreTrash(bpid); dn.getFSDataset().clearRollingUpgradeMarker(bpid); } } {code} HDFS-6800 and HDFS-6981 modified this behavior, making it not completely clear whether this is somehow intentional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5356) MiniDFSCluster should close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364868#comment-14364868 ] Hudson commented on HDFS-5356: -- FAILURE: Integrated in Hadoop-trunk-Commit #7346 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7346/]) HDFS-5356. MiniDFSCluster should close all open FileSystems when shutdown() (Contributed by Rakesh R) (vinayakumarb: rev 018893e81ec1c43e6c79c77adec92c2edfb20cab) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt MiniDFSCluster should close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some metrics-related unit tests failed. Because MiniDFSCluster never closes open FileSystems, DFSClients remain alive after MiniDFSCluster shutdown(). The metrics of those DFSClients still exist in DefaultMetricsSystem, and this makes other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7939) Two fsimage_rollback_* files are created which are not deleted after rollback.
J.Andreina created HDFS-7939: Summary: Two fsimage_rollback_* files are created which are not deleted after rollback. Key: HDFS-7939 URL: https://issues.apache.org/jira/browse/HDFS-7939 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical During checkpointing, if there is any failure in uploading to the remote Namenode, then restarting the Namenode with the rollingUpgrade started option creates 2 fsimage_rollback_* files at the Active Namenode. On rolling upgrade rollback, the initially created fsimage_rollback_* file is not deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5356) MiniDFSCluster should close all open FileSystems when shutdown()
[ https://issues.apache.org/jira/browse/HDFS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364903#comment-14364903 ] Rakesh R commented on HDFS-5356: Thanks [~cmccabe] for the reviews. Thanks [~vinayrpet] for reviewing and committing the patch. MiniDFSCluster should close all open FileSystems when shutdown() --- Key: HDFS-5356 URL: https://issues.apache.org/jira/browse/HDFS-5356 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.2.0 Reporter: haosdent Assignee: Rakesh R Priority: Critical Fix For: 2.8.0 Attachments: HDFS-5356-1.patch, HDFS-5356-2.patch, HDFS-5356-3.patch, HDFS-5356-4.patch, HDFS-5356-5.patch, HDFS-5356-6.patch, HDFS-5356-7.patch, HDFS-5356-8.patch, HDFS-5356.patch After adding some metrics functions to DFSClient, I found that some metrics-related unit tests failed. Because MiniDFSCluster never closes open FileSystems, DFSClients remain alive after MiniDFSCluster shutdown(). The metrics of those DFSClients still exist in DefaultMetricsSystem, and this makes other unit tests fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
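The fix boils down to a resource-tracking pattern: the test cluster remembers every file system handle it hands out and closes them all during shutdown(), so clients (and their registered metrics) do not outlive the cluster. A JDK-only sketch of that pattern; `MiniCluster` and `Handle` are illustrative stand-ins for MiniDFSCluster and FileSystem, not the actual Hadoop classes.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Sketch: track every handle handed out, close them all on shutdown.
public class MiniCluster {
    public static class Handle implements Closeable {
        public boolean open = true;
        @Override public void close() { open = false; }
    }

    private final List<Handle> handles = new ArrayList<>();

    public Handle getFileSystem() {
        Handle h = new Handle();
        handles.add(h);   // remember it so shutdown() can close it
        return h;
    }

    public void shutdown() {
        for (Handle h : handles) {
            h.close();    // otherwise clients and their metrics leak into later tests
        }
        handles.clear();
    }
}
```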
[jira] [Commented] (HDFS-7838) Expose truncate API for libhdfs
[ https://issues.apache.org/jira/browse/HDFS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364916#comment-14364916 ] Hudson commented on HDFS-7838: -- FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/869/]) HDFS-7838. Expose truncate API for libhdfs. (yliu) (yliu: rev 48c2db34eff376c0f3a72587a5540b1e3dffafd2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/hdfs_web.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h Expose truncate API for libhdfs --- Key: HDFS-7838 URL: https://issues.apache.org/jira/browse/HDFS-7838 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7838.001.patch, HDFS-7838.002.patch, HDFS-7838.003.patch, HDFS-7838.004.patch It's good to expose truncate in libhdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
[ https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364908#comment-14364908 ] Hudson commented on HDFS-7886: -- FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/869/]) HDFS-7886. Fix TestFileTruncate failures. Contributed by Plamen Jeliazkov and Konstantin Shvachko. (shv: rev ce5de93a5837e115e1f0b7d3c5a67ace25385a63) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes Key: HDFS-7886 URL: https://issues.apache.org/jira/browse/HDFS-7886 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Plamen Jeliazkov Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7886-01.patch, HDFS-7886-02.patch, HDFS-7886-branch2.patch, HDFS-7886.patch https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7884) NullPointerException in BlockSender
[ https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364799#comment-14364799 ] Brahma Reddy Battula commented on HDFS-7884: :).. NullPointerException in BlockSender --- Key: HDFS-7884 URL: https://issues.apache.org/jira/browse/HDFS-7884 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Blocker Attachments: HDFS-7884.patch, h7884_20150313.patch, org.apache.hadoop.hdfs.TestAppendSnapshotTruncate-output.txt {noformat} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:264) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249) at java.lang.Thread.run(Thread.java:745) {noformat} BlockSender.java:264 is shown below {code} this.volumeRef = datanode.data.getVolume(block).obtainReference(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7938) OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac
[ https://issues.apache.org/jira/browse/HDFS-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365638#comment-14365638 ] Colin Patrick McCabe commented on HDFS-7938: Thanks, [~kiranmr]. OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac Key: HDFS-7938 URL: https://issues.apache.org/jira/browse/HDFS-7938 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Colin Patrick McCabe Assignee: Kiran Kumar M R Priority: Critical In OpensslSecureRandom.c, pthread_threadid_np is being used with an unsigned long, but the type signature requires a uint64_t. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365637#comment-14365637 ] Colin Patrick McCabe commented on HDFS-7940: +1. Thanks, [~rakeshr]. Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365635#comment-14365635 ] Hadoop QA commented on HDFS-7930: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705086/HDFS-7930.002.patch against trunk revision 7179f94. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFileTruncate Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9921//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9921//console This message is automatically generated. commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the original block, it does not remove the unconfirmed locations. As a result, the block stores locations with different lengths or genStamps (corrupt). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
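The core of the fix being discussed for HDFS-7930 is a set-difference step: after block recovery, any previously recorded location that is not among the confirmed {{newTargets}} should be dropped from the block's location list. A JDK-only sketch under that assumption, with plain strings standing in for DatanodeStorageInfo; this is not the actual NameNode code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: keep only the locations confirmed by block recovery.
public class SyncLocations {
    public static List<String> commitLocations(List<String> recorded,
                                               List<String> newTargets) {
        Set<String> confirmed = new HashSet<>(newTargets);
        List<String> result = new ArrayList<>();
        for (String loc : recorded) {
            if (confirmed.contains(loc)) {  // unconfirmed replicas are removed
                result.add(loc);
            }
        }
        return result;
    }
}
```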
[jira] [Updated] (HDFS-7942) NFS: regex value of nfs.exports.allowed.hosts is not working as expected
[ https://issues.apache.org/jira/browse/HDFS-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-7942: - Description: Thanks, [~yeshavora], for reporting this problem. Set regex value in nfs.exports.allowed.hosts property. {noformat} <property><name>nfs.exports.allowed.hosts</name><value>206.190.52.[26|23] rw</value></property> {noformat} With this value, neither 206.190.52.26 nor 206.190.52.23 can mount nfs and act as nfs client. In conclusion, no host can mount nfs with this regex value due to access denied error. {noformat} $ sudo su - -c mount -o soft,proto=tcp,vers=3,rsize=1048576,wsize=1048576,nolock 206.190.52.23:/ /tmp/tmp_mnt root mount.nfs: access denied by server while mounting 206.190.52.23:/ {noformat} was: set regex value in nfs.exports.allowed.hosts property. {noformat} <property><name>nfs.exports.allowed.hosts</name><value>206.190.52.[26|23] rw</value></property> {noformat} With this value, neither 206.190.52.26 nor 206.190.52.23 can mount nfs and act as nfs client. In conclusion, no host can mount nfs with this regex value due to access denied error. {noformat} $ sudo su - -c mount -o soft,proto=tcp,vers=3,rsize=1048576,wsize=1048576,nolock 206.190.52.23:/ /tmp/tmp_mnt root mount.nfs: access denied by server while mounting 206.190.52.23:/ {noformat} NFS: regex value of nfs.exports.allowed.hosts is not working as expected Key: HDFS-7942 URL: https://issues.apache.org/jira/browse/HDFS-7942 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.6.0 Reporter: Brandon Li Assignee: Brandon Li Thanks, [~yeshavora], for reporting this problem. Set regex value in nfs.exports.allowed.hosts property. {noformat} <property><name>nfs.exports.allowed.hosts</name><value>206.190.52.[26|23] rw</value></property> {noformat} With this value, neither 206.190.52.26 nor 206.190.52.23 can mount nfs and act as nfs client. In conclusion, no host can mount nfs with this regex value due to access denied error. 
{noformat} $ sudo su - -c mount -o soft,proto=tcp,vers=3,rsize=1048576,wsize=1048576,nolock 206.190.52.23:/ /tmp/tmp_mnt root mount.nfs: access denied by server while mounting 206.190.52.23:/ {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
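One plausible reading of the symptom above (not a confirmed root cause from this JIRA): in regex syntax, `[26|23]` is a character class that matches exactly one of the characters `2`, `6`, `|`, or `3`, not the alternation `(26|23)` the value appears to intend, so two-digit host suffixes like `.26` never match. A JDK-only illustration:

```java
import java.util.regex.Pattern;

// Demonstrates character class [26|23] vs. group alternation (26|23)
// when matching a whole host string against the configured pattern.
public class HostMatch {
    public static boolean matches(String regex, String host) {
        return Pattern.matches(regex, host);
    }

    public static void main(String[] args) {
        // [26|23] consumes a single character, so "206.190.52.26" is too long.
        System.out.println(matches("206.190.52.[26|23]", "206.190.52.26"));
        // (26|23) is the alternation the value presumably intended.
        System.out.println(matches("206.190.52.(26|23)", "206.190.52.26"));
    }
}
```

Note the unescaped dots also match any character, so even the corrected group form is looser than a literal IP match.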
[jira] [Updated] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads
[ https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7944: Description: {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an unmodifiable list. Since NN addition/removal is relatively rare we can just use a {{CopyOnWriteArrayList}} for concurrency. was:{{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an UnmodifiableList. Since NN addition/removal is relatively rare we can just use a CopyOnWriteArrayList for concurrency. Minor cleanup of BlockPoolManager#getAllNamenodeThreads --- Key: HDFS-7944 URL: https://issues.apache.org/jira/browse/HDFS-7944 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an unmodifiable list. Since NN addition/removal is relatively rare we can just use a {{CopyOnWriteArrayList}} for concurrency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
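The cleanup described above can be sketched with plain JDK types: keep the actors in a {{CopyOnWriteArrayList}} (additions and removals are rare) and return an unmodifiable view instead of converting list to array and back. `Registry` and `Actor` are illustrative names, not the BlockPoolManager API.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: concurrent reads with rare writes, exposing a read-only view.
public class Registry {
    public static class Actor {}
    private final List<Actor> actors = new CopyOnWriteArrayList<>();

    public void add(Actor a) { actors.add(a); }

    // Callers can iterate safely but cannot mutate the internal list.
    public List<Actor> getAll() {
        return Collections.unmodifiableList(actors);
    }
}
```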
[jira] [Updated] (HDFS-7713) Improve the HDFS Web UI browser to allow creating dirs
[ https://issues.apache.org/jira/browse/HDFS-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7713: --- Attachment: HDFS-7713.06.patch Thanks for your detailed review, Haohui! Much appreciated. Sorry, I was OOO last week. Here is a patch that incorporates all your feedback. Improve the HDFS Web UI browser to allow creating dirs -- Key: HDFS-7713 URL: https://issues.apache.org/jira/browse/HDFS-7713 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7713.01.patch, HDFS-7713.02.patch, HDFS-7713.03.patch, HDFS-7713.04.patch, HDFS-7713.05.patch, HDFS-7713.06.patch This sub-task JIRA is for improving the NN HTML5 UI to allow the user to create directories. It uses WebHDFS and adds to the great work done in HDFS-6252. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7943) Append cannot handle the last block with length greater than the preferred block size
[ https://issues.apache.org/jira/browse/HDFS-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365720#comment-14365720 ] Jing Zhao commented on HDFS-7943: - We have two options to fix this: # Change append to support blocks with size greater than the preferred block size. Specifically, we can update the logic of {{BlockManager#convertLastBlockToUnderConstruction}}: if the last block size is greater than or equal to the preferred size, the client should write to a new block. Since the same function is also used by truncate, the new logic should also support the scenario that data is truncated from such a block. # Or we can add a new restriction to concat: the source file's preferred block size cannot be greater than the target's. This fix is simpler and keeps the semantic that the preferred block size is the upper limit of a file's blocks. Append cannot handle the last block with length greater than the preferred block size - Key: HDFS-7943 URL: https://issues.apache.org/jira/browse/HDFS-7943 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker In HDFS-3689, we removed the restriction from concat that all the source files should have the same preferred block size as the target file. This can cause a file to contain blocks with size larger than its preferred block size. If such a block happens to be the last block of a file, and later we append data to the file without the {{CreateFlag.NEW_BLOCK}} flag (i.e., appending data to the last block), it looks like the current client code will keep writing to the last block and never allocate a new block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
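Option 2 from the comment above is essentially a precondition on concat. A JDK-only sketch of that check, assuming the invariant that a file's preferred block size bounds each of its blocks; `ConcatCheck` and `verifyPreferredBlockSize` are illustrative names, not the FSDirConcatOp API.

```java
// Sketch: reject concat when any source file's preferred block size
// exceeds the target's, so the target never gains an oversized block.
public class ConcatCheck {
    public static void verifyPreferredBlockSize(long targetPref, long[] sourcePrefs) {
        for (long src : sourcePrefs) {
            if (src > targetPref) {
                throw new IllegalArgumentException(
                    "Source preferred block size " + src
                    + " exceeds target's " + targetPref);
            }
        }
    }
}
```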
[jira] [Updated] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads
[ https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7944: Issue Type: Improvement (was: Bug) Minor cleanup of BlockPoolManager#getAllNamenodeThreads --- Key: HDFS-7944 URL: https://issues.apache.org/jira/browse/HDFS-7944 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an UnmodifiableList. Since NN addition/removal is relatively rare we can just use a CopyOnWriteArrayList for concurrency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7945) The WebHdfs system on DN does not honor the length parameter
[ https://issues.apache.org/jira/browse/HDFS-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365780#comment-14365780 ] Haohui Mai commented on HDFS-7945: -- This is a regression, thus it is a blocker. The WebHdfs system on DN does not honor the length parameter Key: HDFS-7945 URL: https://issues.apache.org/jira/browse/HDFS-7945 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Blocker HDFS-7279 introduces a new WebHdfs server on the DN. The new server does not honor the length parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7945) The WebHdfs system on DN does not honor the length parameter
Haohui Mai created HDFS-7945: Summary: The WebHdfs system on DN does not honor the length parameter Key: HDFS-7945 URL: https://issues.apache.org/jira/browse/HDFS-7945 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Blocker HDFS-7279 introduces a new WebHdfs server on the DN. The new server does not honor the length parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
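For context, honoring the length parameter means the server stops after the requested number of bytes instead of streaming to end-of-file. A minimal sketch of that arithmetic (illustrative only; not the actual DN WebHdfs handler introduced in HDFS-7279):

```java
// Sketch of the range clamping an OPEN handler needs: return at most
// min(length, fileSize - offset) bytes. A negative length stands in for
// "length parameter absent", i.e., read to EOF.
public class ReadRange {
    public static long bytesToSend(long fileSize, long offset, long length) {
        long remaining = Math.max(0L, fileSize - offset);
        return (length < 0) ? remaining : Math.min(length, remaining);
    }
}
```

A server that ignores the parameter effectively always behaves as if length were absent, which is the regression reported here.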
[jira] [Updated] (HDFS-7927) Fluentd unable to write events to MaprFS using httpfs
[ https://issues.apache.org/jira/browse/HDFS-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7927: --- Status: Open (was: Patch Available) Cancelling the patch as it does not apply. Fluentd unable to write events to MaprFS using httpfs - Key: HDFS-7927 URL: https://issues.apache.org/jira/browse/HDFS-7927 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Environment: mapr 4.0.1 Reporter: Roman Slysh Attachments: HDFS-7927.patch The issue is on the MaprFS file system. It can probably be reproduced on HDFS, but we are not sure. We have observed in the td-agent log that whenever the webhdfs plugin flushes events, it calls append instead of creating the file on MaprFS when communicating over webhdfs. We need to modify this plugin to create the file and then append data to it; creating the file manually is not a solution, since many log events are written to the filesystem and the files need to rotate on a timely basis. http://docs.fluentd.org/articles/http-to-hdfs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
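The fix the reporter asks for is a "create if absent, else append" pattern in the plugin. A minimal sketch of that pattern, shown against the local filesystem with plain java.io (the real plugin would issue WebHDFS CREATE and APPEND operations instead; the class name is hypothetical):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Sketch of create-then-append semantics: the append flag on FileWriter
// creates the file when it is missing and appends when it exists, which
// is the behavior the fluentd webhdfs plugin needs for rotating logs.
public class CreateThenAppend {
    public static void write(File f, String event) throws IOException {
        try (FileWriter w = new FileWriter(f, true)) { // true = append mode
            w.write(event + "\n");
        }
    }
}
```

Each flush then succeeds whether or not the target file already exists, so log rotation can create fresh files without manual intervention.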
[jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365818#comment-14365818 ] Konstantin Shvachko commented on HDFS-7930: --- Looks good now. One last thing: we should add a logging message in {{markBlockReplicasAsCorrupt()}} reporting which locations are being marked as corrupt. We can do it at the info level. This will help find problems, if there are any. Also, the test failure is related to the new code. The very first {{checkBlockRecovery()}} times out. I did not look deeply, but it could be because we only have 3 DNs. One replica out of three is corrupt, but there are no more targets to replicate it to before the corrupt replica can be removed. The log message would have helped. commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the original block it does not remove unconfirmed locations. As a result, the block stores locations with different lengths or genStamps (corrupt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HDFS-7922) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors
[ https://issues.apache.org/jira/browse/HDFS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7922: --- Comment: was deleted (was: Please tell me how i can use batch file HDFS-1783 https://issues.apache.org/jira/browse/HDFS-1783 Please) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors - Key: HDFS-7922 URL: https://issues.apache.org/jira/browse/HDFS-7922 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Attachments: 001-HDFS-7922.patch, 002-HDFS-7922.patch ShortCircuitCache has the following executors. It would be good to shut down these pools during ShortCircuitCache#close to avoid leaks. {code} /** * The executor service that runs the cacheCleaner. */ private final ScheduledThreadPoolExecutor cleanerExecutor = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder(). setDaemon(true).setNameFormat("ShortCircuitCache_Cleaner"). build()); /** * The executor service that runs the slot releaser. */ private final ScheduledThreadPoolExecutor releaserExecutor = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder(). setDaemon(true).setNameFormat("ShortCircuitCache_SlotReleaser"). build()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
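The proposed fix boils down to shutting those two pools down in close(). A minimal standalone sketch of that shape (not the actual Hadoop patch; the class name and timeout are illustrative):

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of the leak fix: a close() that stops both scheduled pools so
// their daemon threads do not outlive the cache.
public class PoolOwner implements AutoCloseable {
    private final ScheduledThreadPoolExecutor cleanerExecutor =
        new ScheduledThreadPoolExecutor(1);
    private final ScheduledThreadPoolExecutor releaserExecutor =
        new ScheduledThreadPoolExecutor(1);

    @Override
    public void close() {
        // Stop accepting new tasks; in-flight tasks may finish.
        cleanerExecutor.shutdown();
        releaserExecutor.shutdown();
        try {
            cleanerExecutor.awaitTermination(5, TimeUnit.SECONDS);
            releaserExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public boolean poolsShutDown() {
        return cleanerExecutor.isShutdown() && releaserExecutor.isShutdown();
    }
}
```

Whether to use shutdown() or shutdownNow() depends on whether pending cleaner/releaser tasks must still run during teardown; the patch attachments would settle that choice.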
[jira] [Updated] (HDFS-7943) Append cannot handle the last block with length greater than the preferred block size
[ https://issues.apache.org/jira/browse/HDFS-7943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7943: Status: Patch Available (was: Open) Append cannot handle the last block with length greater than the preferred block size - Key: HDFS-7943 URL: https://issues.apache.org/jira/browse/HDFS-7943 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-7943.000.patch In HDFS-3689, we removed the restriction from concat that all the source files must have the same preferred block size as the target file. This can cause a file to contain blocks larger than its preferred block size. If such a block happens to be the last block of a file, and we later append data to the file without the {{CreateFlag.NEW_BLOCK}} flag (i.e., appending data to the last block), it looks like the current client code will keep writing to the last block and never allocate a new block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads
[ https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal moved HADOOP-11725 to HDFS-7944: -- Target Version/s: 2.8.0 (was: 2.8.0) Affects Version/s: (was: 2.6.0) 2.6.0 Key: HDFS-7944 (was: HADOOP-11725) Project: Hadoop HDFS (was: Hadoop Common) Minor cleanup of BlockPoolManager#getAllNamenodeThreads --- Key: HDFS-7944 URL: https://issues.apache.org/jira/browse/HDFS-7944 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to array conversion and vice versa by returning an UnmodifiableList. Since NN addition/removal is relatively rare we can just use a CopyOnWriteArrayList for concurrency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7054) Make DFSOutputStream tracing more fine-grained
[ https://issues.apache.org/jira/browse/HDFS-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365801#comment-14365801 ] Colin Patrick McCabe commented on HDFS-7054: The new findbugs warnings are all from YARN, which is not changed in this patch: {code} IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadListener; locked 50% of time IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocConf; locked 80% of time IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.fsOpDurations; locked 66% of time IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.updateInterval; locked 75% of time {code} Similarly, none of the new warnings are in files changed by this patch, so they seem like false positives. Make DFSOutputStream tracing more fine-grained -- Key: HDFS-7054 URL: https://issues.apache.org/jira/browse/HDFS-7054 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7054.001.patch, HDFS-7054.002.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7912) Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks
[ https://issues.apache.org/jira/browse/HDFS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao resolved HDFS-7912. - Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed Thanks for the review, Zhe! I've committed this. Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks -- Key: HDFS-7912 URL: https://issues.apache.org/jira/browse/HDFS-7912 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Fix For: HDFS-7285 Attachments: HDFS-7912.000.patch Now with striped blocks and the design that uses a single BlockInfoStriped object to track all the corresponding blocks, we need to clearly distinguish between the types Block and BlockInfo in BlockManager. Specifically, data structures like {{UnderReplicatedBlocks}} and {{PendingReplicationBlocks}} should track BlockInfo instead of Block in order to support striped block recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7942) NFS: regex value of nfs.exports.allowed.hosts is not working as expected
Brandon Li created HDFS-7942: Summary: NFS: regex value of nfs.exports.allowed.hosts is not working as expected Key: HDFS-7942 URL: https://issues.apache.org/jira/browse/HDFS-7942 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.6.0 Reporter: Brandon Li Assignee: Brandon Li Set a regex value in the nfs.exports.allowed.hosts property: {noformat} <property><name>nfs.exports.allowed.hosts</name><value>206.190.52.[26|23] rw</value></property> {noformat} With this value, neither 206.190.52.26 nor 206.190.52.23 can mount NFS and act as an NFS client. In conclusion, no host can mount NFS with this regex value due to an access denied error. {noformat} $ sudo su - -c "mount -o soft,proto=tcp,vers=3,rsize=1048576,wsize=1048576,nolock 206.190.52.23:/ /tmp/tmp_mnt" root mount.nfs: access denied by server while mounting 206.190.52.23:/ {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
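One likely explanation, sketched below: in Java regex syntax, square brackets build a character class (any single one of the characters '2', '6', '|', '3'), not an alternation, so "[26|23]" can never match the two-character suffixes "26" or "23". A grouped alternation "(26|23)" would. This is an illustrative demo of the regex semantics only; the actual NFS export matcher may apply anchoring or normalization of its own.

```java
import java.util.regex.Pattern;

// Demonstrates why the reported value fails as a Java regex:
// [26|23] is a character class matching one character, while (26|23)
// is the alternation the reporter presumably intended.
public class ExportRegexDemo {
    public static boolean allowed(String pattern, String host) {
        // Pattern.matches requires the whole host string to match.
        return Pattern.matches(pattern, host);
    }

    public static void main(String[] args) {
        String charClass = "206\\.190\\.52\\.[26|23]"; // one trailing char only
        String grouped   = "206\\.190\\.52\\.(26|23)"; // true alternation
        System.out.println(allowed(charClass, "206.190.52.26")); // false
        System.out.println(allowed(grouped, "206.190.52.26"));   // true
    }
}
```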
[jira] [Updated] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7940: --- Resolution: Fixed Fix Version/s: 2.7.0 Target Version/s: 2.7.0 Status: Resolved (was: Patch Available) Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HDFS-7922) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors
[ https://issues.apache.org/jira/browse/HDFS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7922: --- Comment: was deleted (was: bq.Please tell me how i can use batch file HDFS-1783 [~Rosa] It looks like the comment is unrelated to this issue.) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors - Key: HDFS-7922 URL: https://issues.apache.org/jira/browse/HDFS-7922 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Attachments: 001-HDFS-7922.patch, 002-HDFS-7922.patch ShortCircuitCache has the following executors. It would be good to shut down these pools during ShortCircuitCache#close to avoid leaks. {code} /** * The executor service that runs the cacheCleaner. */ private final ScheduledThreadPoolExecutor cleanerExecutor = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder(). setDaemon(true).setNameFormat("ShortCircuitCache_Cleaner"). build()); /** * The executor service that runs the slot releaser. */ private final ScheduledThreadPoolExecutor releaserExecutor = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder(). setDaemon(true).setNameFormat("ShortCircuitCache_SlotReleaser"). build()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7943) Append cannot handle the last block with length greater than the preferred block size
Jing Zhao created HDFS-7943: --- Summary: Append cannot handle the last block with length greater than the preferred block size Key: HDFS-7943 URL: https://issues.apache.org/jira/browse/HDFS-7943 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker In HDFS-3689, we removed the restriction from concat that all the source files must have the same preferred block size as the target file. This can cause a file to contain blocks larger than its preferred block size. If such a block happens to be the last block of a file, and we later append data to the file without the {{CreateFlag.NEW_BLOCK}} flag (i.e., appending data to the last block), it looks like the current client code will keep writing to the last block and never allocate a new block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation
[ https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao reassigned HDFS-7587: --- Assignee: Jing Zhao Edit log corruption can happen if append fails with a quota violation - Key: HDFS-7587 URL: https://issues.apache.org/jira/browse/HDFS-7587 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Kihwal Lee Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-7587.001.patch, HDFS-7587.patch We have seen a standby namenode crashing due to edit log corruption. It was complaining that {{OP_CLOSE}} cannot be applied because the file is not under-construction. When a client was trying to append to the file, the remaining space quota was very small. This caused a failure in {{prepareFileForWrite()}}, but after the inode was already converted for writing and a lease added. Since these were not undone when the quota violation was detected, the file was left in under-construction with an active lease without edit logging {{OP_ADD}}. A subsequent {{append()}} eventually caused a lease recovery after the soft limit period. This resulted in {{commitBlockSynchronization()}}, which closed the file with {{OP_CLOSE}} being logged. Since there was no corresponding {{OP_ADD}}, edit replaying could not apply this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7587) Edit log corruption can happen if append fails with a quota violation
[ https://issues.apache.org/jira/browse/HDFS-7587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7587: Attachment: HDFS-7587.001.patch Rebased Daryn's patch. Also made changes based on Nicholas's comments, i.e., verify the quota first and update the quota only after the action. With the fix from HDFS-7943 we will not have blocks with size greater than the preferred block size, so we can avoid the scenarios where quota is earned back. Truncate may have a similar issue when the data to truncate is only part of the original last block. Will update the patch later to fix this part. Edit log corruption can happen if append fails with a quota violation - Key: HDFS-7587 URL: https://issues.apache.org/jira/browse/HDFS-7587 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Kihwal Lee Priority: Blocker Attachments: HDFS-7587.001.patch, HDFS-7587.patch We have seen a standby namenode crashing due to edit log corruption. It was complaining that {{OP_CLOSE}} cannot be applied because the file is not under-construction. When a client was trying to append to the file, the remaining space quota was very small. This caused a failure in {{prepareFileForWrite()}}, but after the inode was already converted for writing and a lease added. Since these were not undone when the quota violation was detected, the file was left in under-construction with an active lease without edit logging {{OP_ADD}}. A subsequent {{append()}} eventually caused a lease recovery after the soft limit period. This resulted in {{commitBlockSynchronization()}}, which closed the file with {{OP_CLOSE}} being logged. Since there was no corresponding {{OP_ADD}}, edit replaying could not apply this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
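The check-before-mutate ordering from Nicholas's comment can be sketched generically. This is a hypothetical helper illustrating the pattern, not the NameNode quota code.

```java
// Sketch of "verify first, mutate after": a quota violation throws before
// any state changes, so there is nothing to undo. The HDFS-7587 bug was
// the opposite order (inode converted and lease added before the quota
// check failed, with no rollback).
public class QuotaGuard {
    static class QuotaExceededException extends Exception {}

    private long used;
    private final long limit;

    QuotaGuard(long used, long limit) {
        this.used = used;
        this.limit = limit;
    }

    public void consume(long delta) throws QuotaExceededException {
        if (used + delta > limit) {
            throw new QuotaExceededException(); // no state mutated yet
        }
        used += delta; // apply only after the check passes
    }

    public long getUsed() { return used; }
}
```

The invariant worth testing is that a failed consume() leaves the usage counter untouched.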
[jira] [Updated] (HDFS-7927) Fluentd unable to write events to MaprFS using httpfs
[ https://issues.apache.org/jira/browse/HDFS-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7927: --- Fix Version/s: (was: 2.4.1) Fluentd unable to write events to MaprFS using httpfs - Key: HDFS-7927 URL: https://issues.apache.org/jira/browse/HDFS-7927 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.1 Environment: mapr 4.0.1 Reporter: Roman Slysh Attachments: HDFS-7927.patch The issue is on the MaprFS file system. It can probably be reproduced on HDFS, but we are not sure. We have observed in the td-agent log that whenever the webhdfs plugin flushes events, it calls append instead of creating the file on MaprFS when communicating over webhdfs. We need to modify this plugin to create the file and then append data to it; creating the file manually is not a solution, since many log events are written to the filesystem and the files need to rotate on a timely basis. http://docs.fluentd.org/articles/http-to-hdfs -- This message was sent by Atlassian JIRA (v6.3.4#6332)