[jira] [Updated] (HDFS-8849) fsck should report number of missing blocks with replication factor 1
[ https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8849:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> fsck should report number of missing blocks with replication factor 1
> ---------------------------------------------------------------------
>
>                Key: HDFS-8849
>                URL: https://issues.apache.org/jira/browse/HDFS-8849
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>         Components: tools
>   Affects Versions: 2.7.1
>           Reporter: Zhe Zhang
>           Assignee: Zhe Zhang
>           Priority: Minor
>
> HDFS-7165 supports reporting the number of blocks with replication factor 1
> in {{dfsadmin}} and NN metrics, but it did not extend {{fsck}} with the same
> support, which is the aim of this JIRA.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
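The improvement above boils down to keeping a second counter alongside the existing missing-blocks total: missing blocks whose replication factor was 1 are unrecoverable by design, so they deserve their own line in the fsck output. A minimal, hypothetical sketch of that tally (the class and field names here are illustrative, not the actual HDFS fsck code):

```java
import java.util.List;

/**
 * Hypothetical sketch of the counting an fsck extension could do: given
 * each block's expected replication factor and live replica count, tally
 * how many of the missing blocks were written with replication factor 1.
 */
public class MissingBlockCounter {
    /** Minimal stand-in for a block's replication state. */
    public static class BlockState {
        final int expectedReplication;
        final int liveReplicas;
        public BlockState(int expectedReplication, int liveReplicas) {
            this.expectedReplication = expectedReplication;
            this.liveReplicas = liveReplicas;
        }
    }

    /** A block is "missing" when no live replicas remain. */
    public static long countMissing(List<BlockState> blocks) {
        return blocks.stream().filter(b -> b.liveReplicas == 0).count();
    }

    /** Missing blocks that had replication factor 1 (unrecoverable by design). */
    public static long countMissingReplOne(List<BlockState> blocks) {
        return blocks.stream()
                .filter(b -> b.liveReplicas == 0 && b.expectedReplication == 1)
                .count();
    }
}
```

The point of the split counter is operational: a missing repl=1 block is expected data loss the user opted into, while a missing repl=3 block signals a cluster problem.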
[jira] [Updated] (HDFS-7273) Add start and stop wrapper scripts for mover
[ https://issues.apache.org/jira/browse/HDFS-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-7273:
-----------------------------
    Target Version/s:  2.9.0  (was: 2.8.0)

> Add start and stop wrapper scripts for mover
> --------------------------------------------
>
>                Key: HDFS-7273
>                URL: https://issues.apache.org/jira/browse/HDFS-7273
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: balancer & mover
>   Affects Versions: 2.6.0
>           Reporter: Benoy Antony
>           Assignee: Benoy Antony
>           Priority: Minor
>
> Similar to the balancer, we need start/stop mover scripts to run the data
> migration tool as a daemon.
[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-6658:
-----------------------------
    Labels:  (was: BB2015-05-TBR)

> Namenode memory optimization - Block replicas list
> --------------------------------------------------
>
>                Key: HDFS-6658
>                URL: https://issues.apache.org/jira/browse/HDFS-6658
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>         Components: namenode
>   Affects Versions: 2.4.1
>           Reporter: Amir Langer
>           Assignee: Daryn Sharp
>        Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf,
>                     HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch,
>                     Namenode Memory Optimizations - Block replicas list.docx,
>                     New primative indexes.jpg, Old triplets.jpg
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a
> linked list of block references for every DatanodeStorageInfo (called
> "triplets").
> We propose to change the way we store the list in memory.
> Using primitive integer indexes instead of object references will reduce the
> memory needed for every block replica (when compressed oops is disabled), and
> in our new design the list overhead will be per DatanodeStorageInfo and not
> per block replica.
> See the attached design doc for details and evaluation results.
[jira] [Updated] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-6658:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Namenode memory optimization - Block replicas list
> --------------------------------------------------
>
>                Key: HDFS-6658
>                URL: https://issues.apache.org/jira/browse/HDFS-6658
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>         Components: namenode
>   Affects Versions: 2.4.1
>           Reporter: Amir Langer
>           Assignee: Daryn Sharp
>        Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf,
>                     HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch,
>                     Namenode Memory Optimizations - Block replicas list.docx,
>                     New primative indexes.jpg, Old triplets.jpg
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a
> linked list of block references for every DatanodeStorageInfo (called
> "triplets").
> We propose to change the way we store the list in memory.
> Using primitive integer indexes instead of object references will reduce the
> memory needed for every block replica (when compressed oops is disabled), and
> in our new design the list overhead will be per DatanodeStorageInfo and not
> per block replica.
> See the attached design doc for details and evaluation results.
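The core idea in the description, replacing per-replica object references with primitive integer indexes owned by each storage, can be sketched as a linked list threaded through parallel `int`/`long` arrays. This is an illustration of the technique only, not the actual BlocksMap redesign from the attached doc (class and method names are invented):

```java
import java.util.Arrays;

/**
 * Illustrative sketch: a per-storage replica list stored as primitive
 * arrays instead of a chain of object references. blockIds[i] holds a
 * block id; next[i] holds the array index of the next replica in this
 * storage's list, with -1 terminating the chain. No per-replica node
 * objects are allocated, which is where the memory savings come from.
 */
public class IntIndexReplicaList {
    private long[] blockIds = new long[4];
    private int[] next = new int[4];
    private int head = -1;  // index of the most recently added replica
    private int size = 0;

    public void add(long blockId) {
        if (size == blockIds.length) {  // grow both arrays together
            blockIds = Arrays.copyOf(blockIds, size * 2);
            next = Arrays.copyOf(next, size * 2);
        }
        blockIds[size] = blockId;
        next[size] = head;  // push onto the front of the chain
        head = size++;
    }

    /** Walk the chain; newest-first order, since add() pushes to the front. */
    public long[] toArray() {
        long[] out = new long[size];
        int i = 0;
        for (int cur = head; cur != -1; cur = next[cur]) {
            out[i++] = blockIds[cur];
        }
        return out;
    }
}
```

With compressed oops disabled, each object reference costs 8 bytes plus object-header overhead for list nodes; an `int` index costs a flat 4 bytes, which is the saving the proposal quantifies in the attached spreadsheet.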
[jira] [Updated] (HDFS-8538) Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy
[ https://issues.apache.org/jira/browse/HDFS-8538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8538:
-----------------------------
    Target Version/s:  2.9.0  (was: 2.8.0)

> Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy
> -------------------------------------------------------------------------------
>
>                Key: HDFS-8538
>                URL: https://issues.apache.org/jira/browse/HDFS-8538
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>   Affects Versions: 2.7.0
>           Reporter: Andrew Wang
>           Assignee: Andrew Wang
>        Attachments: hdfs-8538.001.patch
>
> Datanodes with different-sized disks almost always want the available-space
> policy. Users with homogeneous disks are unaffected.
> Since this code has baked for a while, let's change it to be the default.
[jira] [Updated] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8344:
-----------------------------
    Target Version/s:  2.9.0  (was: 2.8.0)

> NameNode doesn't recover lease for files with missing blocks
> ------------------------------------------------------------
>
>                Key: HDFS-8344
>                URL: https://issues.apache.org/jira/browse/HDFS-8344
>            Project: Hadoop HDFS
>         Issue Type: Bug
>         Components: namenode
>   Affects Versions: 2.7.0
>           Reporter: Ravi Prakash
>           Assignee: Ravi Prakash
>        Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch,
>                     HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch,
>                     HDFS-8344.07.patch, HDFS-8344.08.patch, HDFS-8344.09.patch,
>                     HDFS-8344.10.patch, TestHadoop.java
>
> I found another\(?) instance in which the lease is not recovered. This is
> easily reproducible on a pseudo-distributed single-node cluster.
> # Before you start, it helps if you set the following. This is not necessary,
> but it reduces how long you have to wait:
> {code}
> public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
> public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
> {code}
> # The client starts to write a file. (It could be less than 1 block, but it
> is hflushed, so some of the data has landed on the datanodes.) (I'm copying
> the client code I am using. I generate a jar and run it using $ hadoop jar
> TestHadoop.jar.)
> # The client crashes. (I simulate this by kill -9 of the $(hadoop jar
> TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
> # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was
> only 1.)
> I believe the lease should be recovered and the block should be marked
> missing. However, this is not happening: the lease is never recovered.
> The effect of this bug for us was that nodes could not be decommissioned
> cleanly. Although we knew that the client had crashed, the Namenode never
> released the leases (even after restarting the Namenode, even months
> afterwards).
> There are actually several other cases too where we don't consider what
> happens if ALL the datanodes die while the file is being written, but I am
> going to punt on those for another time.
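The soft/hard limit constants quoted in the reproduction steps encode a two-stage policy: after the soft limit another client may preempt the lease, and after the hard limit the NameNode itself is expected to trigger recovery (which is the step the bug report says never happens). A hedged sketch of just that expiry check, using the quoted constants; the method names are illustrative, not the NameNode's actual LeaseManager API:

```java
/**
 * Sketch of the two-stage lease-expiry policy described in the report.
 * The constants mirror the ones quoted above; the predicates are an
 * illustration of the policy, not the real LeaseManager logic.
 */
public class LeaseLimits {
    public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
    public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;

    /** After the soft limit, another client may preempt and recover the lease. */
    public static boolean softLimitExpired(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > LEASE_SOFTLIMIT_PERIOD;
    }

    /** After the hard limit, the NameNode itself should recover the lease. */
    public static boolean hardLimitExpired(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > LEASE_HARDLIMIT_PERIOD;
    }
}
```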
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805900#comment-15805900 ]

Junping Du commented on HDFS-8344:
----------------------------------

Move it to 2.9 as the 2.8 RC is in progress.

> NameNode doesn't recover lease for files with missing blocks
> ------------------------------------------------------------
>
>                Key: HDFS-8344
>                URL: https://issues.apache.org/jira/browse/HDFS-8344
>            Project: Hadoop HDFS
>         Issue Type: Bug
>         Components: namenode
>   Affects Versions: 2.7.0
>           Reporter: Ravi Prakash
>           Assignee: Ravi Prakash
>        Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch,
>                     HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch,
>                     HDFS-8344.07.patch, HDFS-8344.08.patch, HDFS-8344.09.patch,
>                     HDFS-8344.10.patch, TestHadoop.java
>
> I found another\(?) instance in which the lease is not recovered. This is
> easily reproducible on a pseudo-distributed single-node cluster.
> # Before you start, it helps if you set the following. This is not necessary,
> but it reduces how long you have to wait:
> {code}
> public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
> public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
> {code}
> # The client starts to write a file. (It could be less than 1 block, but it
> is hflushed, so some of the data has landed on the datanodes.) (I'm copying
> the client code I am using. I generate a jar and run it using $ hadoop jar
> TestHadoop.jar.)
> # The client crashes. (I simulate this by kill -9 of the $(hadoop jar
> TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
> # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was
> only 1.)
> I believe the lease should be recovered and the block should be marked
> missing. However, this is not happening: the lease is never recovered.
> The effect of this bug for us was that nodes could not be decommissioned
> cleanly. Although we knew that the client had crashed, the Namenode never
> released the leases (even after restarting the Namenode, even months
> afterwards).
> There are actually several other cases too where we don't consider what
> happens if ALL the datanodes die while the file is being written, but I am
> going to punt on those for another time.
[jira] [Updated] (HDFS-8088) Reduce the number of HTrace spans generated by HDFS reads
[ https://issues.apache.org/jira/browse/HDFS-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8088:
-----------------------------
    Labels:  (was: BB2015-05-TBR)

> Reduce the number of HTrace spans generated by HDFS reads
> ---------------------------------------------------------
>
>                Key: HDFS-8088
>                URL: https://issues.apache.org/jira/browse/HDFS-8088
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>           Reporter: Colin P. McCabe
>           Assignee: Colin P. McCabe
>        Attachments: HDFS-8088.001.patch
>
> HDFS generates too many trace spans on read right now. Every call to read()
> we make generates its own span, which is not very practical for things like
> HBase or Accumulo that do many such reads as part of a single operation.
> Instead of tracing every call to read(), we should only trace the cases where
> we refill the buffer inside a BlockReader.
[jira] [Updated] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occasionally in trunk
[ https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-7527:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> TestDecommission.testIncludeByRegistrationName fails occasionally in trunk
> --------------------------------------------------------------------------
>
>                Key: HDFS-7527
>                URL: https://issues.apache.org/jira/browse/HDFS-7527
>            Project: Hadoop HDFS
>         Issue Type: Bug
>         Components: namenode, test
>           Reporter: Yongjun Zhang
>           Assignee: Binglin Chang
>        Attachments: HDFS-7527.001.patch, HDFS-7527.002.patch
>
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport/
> {quote}
> Error Message
> test timed out after 36 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 36 milliseconds
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName(TestDecommission.java:957)
> 2014-12-15 12:00:19,958 ERROR datanode.DataNode (BPServiceActor.java:run(836)) -
> Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024
> (Datanode Uuid null) service to localhost/127.0.0.1:40565 Datanode denied
> communication with namenode because the host is not in the include-list:
> DatanodeRegistration(127.0.0.1, datanodeUuid=55d8cbff-d8a3-4d6d-ab64-317fff0ee279,
> infoPort=54318, infoSecurePort=0, ipcPort=43726,
> storageInfo=lv=-56;cid=testClusterID;nsid=903754315;c=0)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4402)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1196)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26296)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2127)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2123)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2121)
> 2014-12-15 12:00:29,087 FATAL datanode.DataNode (BPServiceActor.java:run(841)) -
> Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024
> (Datanode Uuid null) service to localhost/127.0.0.1:40565. Exiting.
> java.io.IOException: DN shut down before block pool connected
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo(BPServiceActor.java:186)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:216)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
>         at java.lang.Thread.run(Thread.java:745)
> {quote}
> Found by the tool proposed in HADOOP-11045:
> {quote}
> [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j Hadoop-Hdfs-trunk -n 5 | tee bt.log
> Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 6) that have failed tests in the past 5 days, as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport (2014-12-15 03:30:01)
> Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
> Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1972/testReport (2014-12-13 10:32:27)
> Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/testReport (2014-12-13 03:30:01)
> Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1969/testReport (2014-12-11 03:30:01)
> Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
> Failed test:
[jira] [Updated] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occasionally in trunk
[ https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-7527:
-----------------------------
    Labels:  (was: BB2015-05-TBR)

> TestDecommission.testIncludeByRegistrationName fails occasionally in trunk
> --------------------------------------------------------------------------
>
>                Key: HDFS-7527
>                URL: https://issues.apache.org/jira/browse/HDFS-7527
>            Project: Hadoop HDFS
>         Issue Type: Bug
>         Components: namenode, test
>           Reporter: Yongjun Zhang
>           Assignee: Binglin Chang
>        Attachments: HDFS-7527.001.patch, HDFS-7527.002.patch
>
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport/
> {quote}
> Error Message
> test timed out after 36 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 36 milliseconds
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName(TestDecommission.java:957)
> 2014-12-15 12:00:19,958 ERROR datanode.DataNode (BPServiceActor.java:run(836)) -
> Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024
> (Datanode Uuid null) service to localhost/127.0.0.1:40565 Datanode denied
> communication with namenode because the host is not in the include-list:
> DatanodeRegistration(127.0.0.1, datanodeUuid=55d8cbff-d8a3-4d6d-ab64-317fff0ee279,
> infoPort=54318, infoSecurePort=0, ipcPort=43726,
> storageInfo=lv=-56;cid=testClusterID;nsid=903754315;c=0)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4402)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1196)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26296)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2127)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2123)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2121)
> 2014-12-15 12:00:29,087 FATAL datanode.DataNode (BPServiceActor.java:run(841)) -
> Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024
> (Datanode Uuid null) service to localhost/127.0.0.1:40565. Exiting.
> java.io.IOException: DN shut down before block pool connected
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo(BPServiceActor.java:186)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:216)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
>         at java.lang.Thread.run(Thread.java:745)
> {quote}
> Found by the tool proposed in HADOOP-11045:
> {quote}
> [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j Hadoop-Hdfs-trunk -n 5 | tee bt.log
> Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 6) that have failed tests in the past 5 days, as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport (2014-12-15 03:30:01)
> Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
> Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1972/testReport (2014-12-13 10:32:27)
> Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/testReport (2014-12-13 03:30:01)
> Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1969/testReport (2014-12-11 03:30:01)
> Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
> Failed test:
[jira] [Updated] (HDFS-7964) Add support for async edit logging
[ https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-7964:
-----------------------------
    Target Version/s:  2.9.0  (was: 2.8.0)

> Add support for async edit logging
> ----------------------------------
>
>                Key: HDFS-7964
>                URL: https://issues.apache.org/jira/browse/HDFS-7964
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: namenode
>   Affects Versions: 2.0.2-alpha
>           Reporter: Daryn Sharp
>           Assignee: Daryn Sharp
>            Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>        Attachments: HDFS-7964-branch-2.7.patch, HDFS-7964-branch-2.8.0.patch,
>                     HDFS-7964-rebase.patch, HDFS-7964.patch, HDFS-7964.patch,
>                     HDFS-7964.patch, HDFS-7964.patch
>
> Edit logging is a major source of contention within the NN. logEdit is called
> while holding the namespace write lock, while logSync is called outside of
> the lock to allow greater concurrency. The handler thread remains busy until
> logSync returns, to provide the client with a durability guarantee for the
> response.
> Write-heavy RPC load and/or slow IO causes handlers to stall in logSync.
> Although the write lock is not held, readers are limited/starved and the call
> queue fills. Combining an edit log thread with postponed RPC responses from
> HADOOP-10300 will provide the same durability guarantee but immediately free
> up the handlers.
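The pattern the description proposes, handlers enqueue edits and are released immediately, while a dedicated thread performs the sync and completes the response later, can be sketched with a queue and futures. This is a minimal illustration of the idea only; it is not the HDFS-7964 patch, and all names here are invented:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/**
 * Sketch of async edit logging: handler threads call logEdit() and get a
 * future back immediately; a single background "syncer" thread drains the
 * queue, (notionally) writes and fsyncs the edit, and then completes the
 * future. Completing the future stands in for sending the postponed RPC
 * response once durability is guaranteed.
 */
public class AsyncEditLog implements AutoCloseable {
    private static final class Edit {
        final String op;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Edit(String op) { this.op = op; }
    }

    private final BlockingQueue<Edit> queue = new LinkedBlockingQueue<>();
    private final Thread syncer;
    private volatile boolean running = true;

    public AsyncEditLog() {
        syncer = new Thread(() -> {
            try {
                while (running || !queue.isEmpty()) {
                    Edit e = queue.poll(10, TimeUnit.MILLISECONDS);
                    if (e == null) continue;
                    // Stand-in for writing the edit and fsync'ing the log.
                    e.done.complete(null);
                }
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        });
        syncer.start();
    }

    /** Called by a handler thread; returns immediately, freeing the handler. */
    public CompletableFuture<Void> logEdit(String op) {
        Edit e = new Edit(op);
        queue.add(e);
        return e.done;
    }

    @Override
    public void close() {
        running = false;  // syncer drains any remaining edits, then exits
        try { syncer.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

The durability guarantee is preserved because the client's response (here, the future) is only completed after the sync step, exactly as a blocking logSync would have done, but the handler thread is free the moment the edit is enqueued.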
[jira] [Updated] (HDFS-8643) Add snapshot names list to SnapshottableDirectoryStatus
[ https://issues.apache.org/jira/browse/HDFS-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8643:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Add snapshot names list to SnapshottableDirectoryStatus
> -------------------------------------------------------
>
>                Key: HDFS-8643
>                URL: https://issues.apache.org/jira/browse/HDFS-8643
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>           Reporter: Rakesh R
>           Assignee: Rakesh R
>        Attachments: HDFS-8643-00.patch, HDFS-8643-01.patch
>
> The idea of this jira is to enhance {{SnapshottableDirectoryStatus}} by
> adding a {{snapshotNames}} attribute into it; presently it has only the
> {{snapshotNumber}}. IMHO this would help users get the list of snapshot
> names created. Also, the snapshot names can be used while renaming or
> deleting the snapshots.
> {code}
> // org.apache.hadoop.hdfs.protocol.SnapshottableDirectoryStatus
> /**
>  * @return Snapshot names for the directory.
>  */
> public List<String> getSnapshotNames() {
>   return snapshotNames;
> }
> {code}
[jira] [Commented] (HDFS-7964) Add support for async edit logging
[ https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805898#comment-15805898 ]

Junping Du commented on HDFS-7964:
----------------------------------

Move to 2.9 as the 2.8 RC is in progress.

> Add support for async edit logging
> ----------------------------------
>
>                Key: HDFS-7964
>                URL: https://issues.apache.org/jira/browse/HDFS-7964
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: namenode
>   Affects Versions: 2.0.2-alpha
>           Reporter: Daryn Sharp
>           Assignee: Daryn Sharp
>            Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>        Attachments: HDFS-7964-branch-2.7.patch, HDFS-7964-branch-2.8.0.patch,
>                     HDFS-7964-rebase.patch, HDFS-7964.patch, HDFS-7964.patch,
>                     HDFS-7964.patch, HDFS-7964.patch
>
> Edit logging is a major source of contention within the NN. logEdit is called
> while holding the namespace write lock, while logSync is called outside of
> the lock to allow greater concurrency. The handler thread remains busy until
> logSync returns, to provide the client with a durability guarantee for the
> response.
> Write-heavy RPC load and/or slow IO causes handlers to stall in logSync.
> Although the write lock is not held, readers are limited/starved and the call
> queue fills. Combining an edit log thread with postponed RPC responses from
> HADOOP-10300 will provide the same durability guarantee but immediately free
> up the handlers.
[jira] [Updated] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-6526:
-----------------------------
    Labels:  (was: BB2015-05-TBR)

> Implement HDFS TtlManager
> -------------------------
>
>                Key: HDFS-6526
>                URL: https://issues.apache.org/jira/browse/HDFS-6526
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: hdfs-client, namenode
>   Affects Versions: 2.4.0
>           Reporter: Zesheng Wu
>           Assignee: Zesheng Wu
>        Attachments: HDFS-6526.1.patch
>
> This issue is used to track development of HDFS TtlManager; for details see
> HDFS-6382.
[jira] [Updated] (HDFS-5875) Add iterator support to INodesInPath
[ https://issues.apache.org/jira/browse/HDFS-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-5875:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Add iterator support to INodesInPath
> ------------------------------------
>
>                Key: HDFS-5875
>                URL: https://issues.apache.org/jira/browse/HDFS-5875
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: namenode
>   Affects Versions: 2.0.0-alpha, 3.0.0-alpha1
>           Reporter: Daryn Sharp
>           Assignee: Daryn Sharp
>        Attachments: HDFS-5875.patch
>
> "Resolve as you go" inode iteration will help with the implementation of
> alternative locking schemes. It will also be the precursor for resolving
> paths once and only once per operation, as opposed to ~3 times per call.
> This is an incremental and compatible change to IIP that does not break
> existing callers.
[jira] [Updated] (HDFS-5875) Add iterator support to INodesInPath
[ https://issues.apache.org/jira/browse/HDFS-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-5875:
-----------------------------
    Labels:  (was: BB2015-05-TBR)

> Add iterator support to INodesInPath
> ------------------------------------
>
>                Key: HDFS-5875
>                URL: https://issues.apache.org/jira/browse/HDFS-5875
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: namenode
>   Affects Versions: 2.0.0-alpha, 3.0.0-alpha1
>           Reporter: Daryn Sharp
>           Assignee: Daryn Sharp
>        Attachments: HDFS-5875.patch
>
> "Resolve as you go" inode iteration will help with the implementation of
> alternative locking schemes. It will also be the precursor for resolving
> paths once and only once per operation, as opposed to ~3 times per call.
> This is an incremental and compatible change to IIP that does not break
> existing callers.
[jira] [Updated] (HDFS-6526) Implement HDFS TtlManager
[ https://issues.apache.org/jira/browse/HDFS-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-6526:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Implement HDFS TtlManager
> -------------------------
>
>                Key: HDFS-6526
>                URL: https://issues.apache.org/jira/browse/HDFS-6526
>            Project: Hadoop HDFS
>         Issue Type: Sub-task
>         Components: hdfs-client, namenode
>   Affects Versions: 2.4.0
>           Reporter: Zesheng Wu
>           Assignee: Zesheng Wu
>        Attachments: HDFS-6526.1.patch
>
> This issue is used to track development of HDFS TtlManager; for details see
> HDFS-6382.
[jira] [Updated] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-9011:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Support splitting BlockReport of a storage into multiple RPC
> ------------------------------------------------------------
>
>                Key: HDFS-9011
>                URL: https://issues.apache.org/jira/browse/HDFS-9011
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>           Reporter: Jing Zhao
>           Assignee: Jing Zhao
>        Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch, HDFS-9011.002.patch
>
> Currently, if a DataNode has too many blocks (more than 1m by default), it
> sends multiple RPCs to the NameNode for the block report, each RPC containing
> the report for a single storage. However, in practice we've seen that
> sometimes even a single storage can contain a large number of blocks, and the
> report can exceed the max RPC data length. It may be helpful to support
> sending multiple RPCs for the block report of a single storage.
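The splitting the description asks for is essentially chunking one storage's block list into bounded slices, one per RPC. A hedged sketch of just the chunking step (the class name and the idea of blocks as plain ids are illustrative; the real report carries richer per-block state):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch: split one storage's block report into chunks of at
 * most maxPerRpc blocks, so that no single RPC can exceed the max RPC data
 * length. Each inner list would become one RPC to the NameNode.
 */
public class BlockReportSplitter {
    public static List<List<Long>> split(List<Long> blocks, int maxPerRpc) {
        List<List<Long>> rpcs = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i += maxPerRpc) {
            // subList is a view; fine here since we only read the chunks
            rpcs.add(blocks.subList(i, Math.min(i + maxPerRpc, blocks.size())));
        }
        return rpcs;
    }
}
```

The subtlety the JIRA's patches have to handle, which this sketch omits, is that the NameNode must know when it has seen the last chunk for a storage before it can finalize the report.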
[jira] [Updated] (HDFS-8959) Provide an iterator-based API for listing all the snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8959:
-----------------------------
    Target Version/s:  (was: 2.8.0)

> Provide an iterator-based API for listing all the snapshottable directories
> ---------------------------------------------------------------------------
>
>                Key: HDFS-8959
>                URL: https://issues.apache.org/jira/browse/HDFS-8959
>            Project: Hadoop HDFS
>         Issue Type: Improvement
>         Components: hdfs-client
>           Reporter: Rakesh R
>           Assignee: Rakesh R
>        Attachments: HDFS-8959-00.patch, HDFS-8959-01.patch, HDFS-8959-02.patch
>
> Presently {{DistributedFileSystem#getSnapshottableDirListing()}} sends the
> whole {{SnapshottableDirectoryStatus[]}} array to the client, so the client
> must have enough space to hold it in memory. There is a chance of the client
> JVM running out of memory because of this. Also, some time back there was a
> [comment|https://issues.apache.org/jira/browse/HDFS-8643?focusedCommentId=14658800=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14658800]
> about the RPC packet limitation, and a large snapshot list can again cause
> issues.
> I believe an iterator-based {{DistributedFileSystem#listSnapshottableDirs()}}
> API would be a good addition!
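The iterator-based API proposed above trades one big RPC for a sequence of small, fixed-size fetches pulled on demand, so client memory stays bounded. A hedged sketch of the client-side shape (everything here is hypothetical: `fetchBatch` stands in for a paged RPC, and a plain in-memory list stands in for the server):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Sketch of a batched listing iterator: instead of receiving the whole
 * directory array in one response, the client pulls one bounded batch at
 * a time as iteration advances. fetchBatch() is a stand-in for the paged
 * RPC a real implementation would issue.
 */
public class SnapshottableDirIterator implements Iterator<String> {
    private final List<String> all;          // stand-in for server-side state
    private final int batchSize;
    private final Deque<String> batch = new ArrayDeque<>();
    private int offset = 0;                  // resume point for the next "RPC"

    public SnapshottableDirIterator(List<String> serverDirs, int batchSize) {
        this.all = serverDirs;
        this.batchSize = batchSize;
    }

    private void fetchBatch() {              // one "RPC" per batch
        int end = Math.min(offset + batchSize, all.size());
        batch.addAll(all.subList(offset, end));
        offset = end;
    }

    @Override public boolean hasNext() {
        if (batch.isEmpty() && offset < all.size()) fetchBatch();
        return !batch.isEmpty();
    }

    @Override public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return batch.poll();
    }
}
```

A real paged RPC would carry a continuation token (e.g. the last directory seen) rather than a plain offset, so listing stays correct when directories are added or removed mid-iteration.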
[jira] [Updated] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6525: - Target Version/s: (was: 2.8.0) > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch, HDFS-6525.2.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8944) Make dfsadmin command options case insensitive
[ https://issues.apache.org/jira/browse/HDFS-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8944: - Target Version/s: (was: 2.8.0) > Make dfsadmin command options case insensitive > -- > > Key: HDFS-8944 > URL: https://issues.apache.org/jira/browse/HDFS-8944 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HDFS-8944.001.patch, HDFS-8944.002.patch > > > Now dfsadmin command options are case sensitive except allowSnapshot and > disallowSnapshot. It would be better to make them case insensitive for > usability and consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
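A minimal sketch of what case-insensitive option matching could look like. `OptionMatcher` is a hypothetical helper, not the actual dfsadmin parsing code, though `-report`, `-safemode`, `-allowSnapshot`, and `-disallowSnapshot` are real dfsadmin flags:

```java
/** Hypothetical sketch: match a dfsadmin-style flag case-insensitively. */
class OptionMatcher {
    private static final String[] OPTIONS = {
        "-report", "-safemode", "-allowSnapshot", "-disallowSnapshot"
    };

    /** Return the canonical spelling of the flag, or null if it is unknown. */
    static String canonicalize(String arg) {
        for (String opt : OPTIONS) {
            if (opt.equalsIgnoreCase(arg)) {
                return opt;
            }
        }
        return null;
    }
}
```

With this, `-SafeMode` and `-safemode` resolve to the same command, matching the behavior allowSnapshot/disallowSnapshot already have.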
[jira] [Updated] (HDFS-6525) FsShell supports HDFS TTL
[ https://issues.apache.org/jira/browse/HDFS-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6525: - Labels: (was: BB2015-05-TBR) > FsShell supports HDFS TTL > - > > Key: HDFS-6525 > URL: https://issues.apache.org/jira/browse/HDFS-6525 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, tools >Affects Versions: 2.4.0 >Reporter: Zesheng Wu >Assignee: Zesheng Wu > Attachments: HDFS-6525.1.patch, HDFS-6525.2.patch > > > This issue is used to track development of supporting HDFS TTL for FsShell, > for details see HDFS-6382. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8976) Create HTML5 cluster webconsole for federated cluster
[ https://issues.apache.org/jira/browse/HDFS-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8976: - Target Version/s: (was: 2.8.0) > Create HTML5 cluster webconsole for federated cluster > - > > Key: HDFS-8976 > URL: https://issues.apache.org/jira/browse/HDFS-8976 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.7.0 >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-8976-01.patch, HDFS-8976-02.patch, > HDFS-8976-03.patch, cluster-health.JPG > > > Since the old JSP variant of the cluster web console is no longer present from > 2.7 onwards, an HTML5 web console is needed for an overview of the overall > cluster. > The 2.7.1 docs say to check the web console as below: {noformat}Similar to the > Namenode status web page, when using federation a Cluster Web Console is > available to monitor the federated cluster at > http:///dfsclusterhealth.jsp. Any Namenode in the cluster > can be used to access this web page.{noformat} > But this page is no longer present. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8817) Make StorageType for Volumes in DataNode visible through JMX
[ https://issues.apache.org/jira/browse/HDFS-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8817: - Target Version/s: 2.9.0 (was: 2.8.0) > Make StorageType for Volumes in DataNode visible through JMX > > > Key: HDFS-8817 > URL: https://issues.apache.org/jira/browse/HDFS-8817 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.8.0 >Reporter: Anu Engineer >Assignee: Anu Engineer > Attachments: HDFS-8817.001.patch > > > StorageTypes are part of Volumes on DataNodes, but right now {{VolumeInfo}} > does not contain the StorageType info. This JIRA proposes to expose that info > through the VolumeInfo JSON. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9119) Discrepancy between edit log tailing interval and RPC timeout for transitionToActive
[ https://issues.apache.org/jira/browse/HDFS-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9119: - Target Version/s: (was: 2.8.0) > Discrepancy between edit log tailing interval and RPC timeout for > transitionToActive > > > Key: HDFS-9119 > URL: https://issues.apache.org/jira/browse/HDFS-9119 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.7.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9119.00.patch > > > {{EditLogTailer}} on the standby NameNode tails edits from the active NameNode > every 2 minutes, but the {{transitionToActive}} RPC call has a timeout of 1 > minute. > If the active NameNode encounters a very intensive metadata workload (in > particular, a lot of {{AddOp}} and {{MkDir}} operations creating new files > and directories), the amount of updates accumulated during the 2-minute edit > log tailing interval can be more than the standby NameNode can catch up on > within the 1-minute timeout window. If that happens, the FailoverController > will time out and give up trying to transition the standby to active. The old > ANN will resume adding more edits. When the SbNN finally finishes catching up > on the edits and tries to become active, it will crash. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6939) Support path-based filtering of inotify events
[ https://issues.apache.org/jira/browse/HDFS-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6939: - Target Version/s: (was: 2.8.0) > Support path-based filtering of inotify events > -- > > Key: HDFS-6939 > URL: https://issues.apache.org/jira/browse/HDFS-6939 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode, qjm >Reporter: James Thomas >Assignee: Surendra Singh Lilhore > Attachments: HDFS-6939-001.patch > > > Users should be able to specify that they only want events involving > particular paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6853) MiniDFSCluster.isClusterUp() should not check if null NameNodes are up
[ https://issues.apache.org/jira/browse/HDFS-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6853: - Labels: (was: BB2015-05-TBR) > MiniDFSCluster.isClusterUp() should not check if null NameNodes are up > -- > > Key: HDFS-6853 > URL: https://issues.apache.org/jira/browse/HDFS-6853 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: James Thomas >Assignee: James Thomas > Attachments: HDFS-6853.2.patch, HDFS-6853.patch > > > Suppose we have a two-NN cluster and then shut down one of the NNs (NN0). > When we try to restart the other NN (NN1), we wait for isClusterUp() to > return true, but this will never happen because NN0 is null and isNameNodeUp > returns false for a null NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6914: - Labels: hdfs (was: BB2015-05-TBR hdfs) > Resolve huge memory consumption Issue with OIV processing PB-based fsimages > --- > > Key: HDFS-6914 > URL: https://issues.apache.org/jira/browse/HDFS-6914 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.1 >Reporter: Hao Chen > Labels: hdfs > Attachments: HDFS-6914.patch, HDFS-6914.v2.patch > > > To better manage and support many large Hadoop clusters in production, we > internally need to automatically export the fsimage to delimited text files in > LSR style and then analyse them with Hive or Pig, or build system metrics for > real-time analysis. > However, due to the internal layout changes introduced by the protobuf-based > fsimage, the OIV processing program consumes an excessive amount of memory. For > example, exporting an 8GB fsimage took about 85GB of memory, which is > unreasonable and badly impacted the performance of other services on the same > server. > To resolve the above problem, I submit this patch, which reduces the memory > consumption of OIV LSR processing by 50%. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6853) MiniDFSCluster.isClusterUp() should not check if null NameNodes are up
[ https://issues.apache.org/jira/browse/HDFS-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6853: - Target Version/s: (was: 2.8.0) > MiniDFSCluster.isClusterUp() should not check if null NameNodes are up > -- > > Key: HDFS-6853 > URL: https://issues.apache.org/jira/browse/HDFS-6853 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: James Thomas >Assignee: James Thomas > Attachments: HDFS-6853.2.patch, HDFS-6853.patch > > > Suppose we have a two-NN cluster and then shut down one of the NNs (NN0). > When we try to restart the other NN (NN1), we wait for isClusterUp() to > return true, but this will never happen because NN0 is null and isNameNodeUp > returns false for a null NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
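The proposed fix amounts to skipping null entries instead of treating them as "down". As a minimal model (assuming we represent each NameNode's state as a `Boolean`, with `null` standing for a shut-down NN; the real method works on MiniDFSCluster's NameNode handles, not Booleans):

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the proposed check: ignore null NameNodes entirely. */
class ClusterCheck {
    /** Up when every non-null NameNode is up and at least one NameNode exists. */
    static boolean isClusterUp(List<Boolean> nameNodes) {
        boolean sawOne = false;
        for (Boolean up : nameNodes) {
            if (up == null) continue;     // skip shut-down NNs instead of failing
            if (!up) return false;
            sawOne = true;
        }
        return sawOne;
    }
}
```

Under this check, restarting NN1 with NN0 shut down (null) no longer blocks forever.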
[jira] [Updated] (HDFS-9586) listCorruptFileBlocks should not output files that all replications are decommissioning
[ https://issues.apache.org/jira/browse/HDFS-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9586: - Target Version/s: (was: 2.8.0) > listCorruptFileBlocks should not output files that all replications are > decommissioning > --- > > Key: HDFS-9586 > URL: https://issues.apache.org/jira/browse/HDFS-9586 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Phil Yang >Assignee: Phil Yang > Attachments: 9586-v1.patch > > > As HDFS-7933 said, we should count decommissioning and decommissioned nodes > separately and regard decommissioning nodes as special live nodes whose data > is not corrupt or missing. > So in listCorruptFileBlocks, which is used by fsck and the HDFS NameNode web > UI, we should report a file as corrupt only if its liveReplicas count and its > decommissioning replica count are both 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6914) Resolve huge memory consumption Issue with OIV processing PB-based fsimages
[ https://issues.apache.org/jira/browse/HDFS-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6914: - Target Version/s: (was: 2.8.0) > Resolve huge memory consumption Issue with OIV processing PB-based fsimages > --- > > Key: HDFS-6914 > URL: https://issues.apache.org/jira/browse/HDFS-6914 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.1 >Reporter: Hao Chen > Labels: hdfs > Attachments: HDFS-6914.patch, HDFS-6914.v2.patch > > > To better manage and support many large Hadoop clusters in production, we > internally need to automatically export the fsimage to delimited text files in > LSR style and then analyse them with Hive or Pig, or build system metrics for > real-time analysis. > However, due to the internal layout changes introduced by the protobuf-based > fsimage, the OIV processing program consumes an excessive amount of memory. For > example, exporting an 8GB fsimage took about 85GB of memory, which is > unreasonable and badly impacted the performance of other services on the same > server. > To resolve the above problem, I submit this patch, which reduces the memory > consumption of OIV LSR processing by 50%. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7168) Use excludedNodes consistently in DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7168: - Labels: (was: BB2015-05-TBR) > Use excludedNodes consistently in DFSOutputStream > - > > Key: HDFS-7168 > URL: https://issues.apache.org/jira/browse/HDFS-7168 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: HDFS-7168.001.patch > > > We currently have two separate collections of excluded nodes in the > {{DFSOutputStream#DataStreamer}}. One is {{DFSOutputStream#failed}}; another > is {{DFSOutputStream#excludedNodes}}. Both of these collections just deal > with blacklisting nodes that we have found to be bad. We should just use > excludedNodes for both. > We also should make this a per-DFSOutputStream variable, rather than being > per-DataStreamer. We don't need to forget all this information whenever a > DataStreamer is torn down. Since {{DFSOutputStream#excludedNodes}} is a > Guava cache, nodes will expire out of it once enough time elapses, so they > will not be permanently blacklisted. > We should also remove {{DFSOutputStream#setTestFilename}}, since it is no > longer needed now that we can safely rename streams that are open for write. > And {{DFSOutputStream#getBlock}} should be synchronized. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6660: - Labels: performance (was: BB2015-05-TBR performance) > Use int instead of object reference to DatanodeStorageInfo in BlockInfo > triplets, > - > > Key: HDFS-6660 > URL: https://issues.apache.org/jira/browse/HDFS-6660 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Amir Langer > Labels: performance > Attachments: HDFS-6660.patch > > > Map an int index to every DatanodeStorageInfo and use it instead of object > reference in the BlockInfo triplets data structure. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
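The idea can be shown in miniature: assign each DatanodeStorageInfo a small int id and store that id in the per-block structure, so each replica reference costs a 4-byte int instead of an 8-byte object pointer (when compressed oops is disabled). `StorageIndexMap` is a hypothetical name, and plain strings stand in for DatanodeStorageInfo objects:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: map each storage to a small int id so per-block structures can
 *  store 4-byte indices rather than 8-byte object references. */
class StorageIndexMap {
    private final List<String> storages = new ArrayList<>(); // id -> storage

    /** Register a storage and return its index. */
    int register(String storageId) {
        storages.add(storageId);
        return storages.size() - 1;
    }

    /** Resolve an index back to the storage it denotes. */
    String lookup(int index) {
        return storages.get(index);
    }
}
```

BlockInfo triplets would then hold these ints, resolving them through the map only when the actual storage object is needed.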
[jira] [Updated] (HDFS-6659) Create a Block List
[ https://issues.apache.org/jira/browse/HDFS-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6659: - Labels: perfomance (was: BB2015-05-TBR perfomance) > Create a Block List > --- > > Key: HDFS-6659 > URL: https://issues.apache.org/jira/browse/HDFS-6659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Amir Langer > Labels: perfomance > Attachments: HDFS-6659.patch > > > BlockList - An efficient array based list that can extend its capacity with > two main features: > 1. Gaps (result of remove operations) are managed internally without the need > for extra memory - We create a linked list of gaps by using the array index > as references + An int to the head of the gaps list. In every insert > operation, we first use any available gap before extending the array. > 2. Array extension is done by chaining different arrays, not by allocating a > larger array and copying all its data across. This is a lot less heavy in > terms of latency for that particular call. It also avoids having large amount > of contiguous heap space and so behaves nicer with garbage collection. > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
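Feature 1 above (the in-array gap free list) can be sketched as follows. `GapArrayList` is a hypothetical, simplified version: it has a fixed capacity and omits feature 2 (chained-array extension). Removed slots store the index of the next gap, threading the free list through the array itself so no extra memory is needed:

```java
/** Sketch of the gap-managed array list: removed slots form a free list
 *  threaded through the array, using indices as "pointers". */
class GapArrayList {
    private final long[] slots;
    private int next = 0;       // first never-used slot
    private int gapHead = -1;   // head of the in-array free list (-1 = empty)

    GapArrayList(int capacity) { slots = new long[capacity]; }

    /** Insert a value, preferring a reclaimed gap; returns its slot index. */
    int insert(long value) {
        int idx;
        if (gapHead != -1) {
            idx = gapHead;
            gapHead = (int) slots[idx]; // follow the free-list link
        } else {
            idx = next++;               // extend into untouched space
        }
        slots[idx] = value;
        return idx;
    }

    /** Remove by index: the freed slot becomes the new free-list head. */
    void remove(int idx) {
        slots[idx] = gapHead;           // link to the previous head
        gapHead = idx;
    }

    long get(int idx) { return slots[idx]; }
}
```

Feature 2 would replace the single `slots` array with a chain of fixed-size arrays, so growth never copies existing data.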
[jira] [Updated] (HDFS-7168) Use excludedNodes consistently in DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7168: - Target Version/s: (was: 2.8.0) > Use excludedNodes consistently in DFSOutputStream > - > > Key: HDFS-7168 > URL: https://issues.apache.org/jira/browse/HDFS-7168 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: HDFS-7168.001.patch > > > We currently have two separate collections of excluded nodes in the > {{DFSOutputStream#DataStreamer}}. One is {{DFSOutputStream#failed}}; another > is {{DFSOutputStream#excludedNodes}}. Both of these collections just deal > with blacklisting nodes that we have found to be bad. We should just use > excludedNodes for both. > We also should make this a per-DFSOutputStream variable, rather than being > per-DataStreamer. We don't need to forget all this information whenever a > DataStreamer is torn down. Since {{DFSOutputStream#excludedNodes}} is a > Guava cache, nodes will expire out of it once enough time elapses, so they > will not be permanently blacklisted. > We should also remove {{DFSOutputStream#setTestFilename}}, since it is no > longer needed now that we can safely rename streams that are open for write. > And {{DFSOutputStream#getBlock}} should be synchronized. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6782) Improve FS editlog logSync
[ https://issues.apache.org/jira/browse/HDFS-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6782: - Labels: (was: BB2015-05-TBR) > Improve FS editlog logSync > -- > > Key: HDFS-6782 > URL: https://issues.apache.org/jira/browse/HDFS-6782 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.1 >Reporter: Yi Liu >Assignee: Yi Liu > Attachments: HDFS-6782.001.patch, HDFS-6782.002.patch > > > In the NN, log sync uses a double buffer (bufCurrent, bufReady): bufCurrent > buffers newly arriving edit ops and bufReady is for flushing. This is > efficient. When a flush is ongoing and bufCurrent is full, the NN goes into a > force log sync, and all new ops are blocked (since force log sync is > protected by the FSNamesystem write lock). After the flush finishes, the new ops > are still blocked, but at that point bufCurrent is actually free and ops could > go ahead and write to the buffer. The following diagram shows the detail. > This JIRA is for this improvement. Thanks [~umamaheswararao] for confirming > this issue. > {code} > edit1(txid1) -- write to bufCurrent logSync - (swap > buffer)flushing --- > edit2(txid2) -- write to bufCurrent logSync - waiting > --- > edit3(txid3) -- write to bufCurrent logSync - waiting > --- > edit4(txid4) -- write to bufCurrent logSync - waiting > --- > edit5(txid5) -- write to bufCurrent --full-- force sync - waiting > --- > edit6(txid6) -- blocked > ... > editn(txidn) -- blocked > {code} > After the flush, it becomes > {code} > edit1(txid1) -- write to bufCurrent logSync - finished > > edit2(txid2) -- write to bufCurrent logSync - flushing > --- > edit3(txid3) -- write to bufCurrent logSync - waiting > --- > edit4(txid4) -- write to bufCurrent logSync - waiting > --- > edit5(txid5) -- write to bufCurrent --full-- force sync - waiting > --- > edit6(txid6) -- blocked > ... > editn(txidn) -- blocked > {code} > After edit1 finishes, bufCurrent is free, and the thread that flushes txid2 > will also flush txid3-5, so we should return from the force sync of edit5 > and the FSNamesystem write lock will be freed (don't worry about the edit5 op > returning early: there will be a normal logSync after the force logSync, and > it will wait for the sync to finish). This is the idea of this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
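The double-buffer scheme the description walks through can be reduced to a few lines. `DoubleBuffer` below is a hypothetical simplification of the NN's edits buffer (txids stand in for serialized edit ops): a sync swaps the two buffers under the lock and then does the slow flush outside it, so writers can keep appending to the now-empty current buffer while the flush runs:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the double-buffer idea: edits go to bufCurrent while
 *  bufReady is being flushed; a logSync swaps the two. */
class DoubleBuffer {
    private List<Long> bufCurrent = new ArrayList<>(); // accepting new edits
    private List<Long> bufReady = new ArrayList<>();   // being flushed
    final List<Long> flushed = new ArrayList<>();      // simulated durable log

    synchronized void write(long txid) { bufCurrent.add(txid); }

    /** Swap buffers under the lock, then flush outside it. */
    void logSync() {
        List<Long> toFlush;
        synchronized (this) {
            List<Long> tmp = bufReady;
            bufReady = bufCurrent;
            bufCurrent = tmp;          // writers can proceed immediately
            toFlush = bufReady;
        }
        flushed.addAll(toFlush);       // the slow I/O happens outside the lock
        toFlush.clear();
    }
}
```

The improvement in this JIRA is about the force-sync path: once the swap has happened, blocked writers should be released rather than waiting for the whole flush.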
[jira] [Updated] (HDFS-6659) Create a Block List
[ https://issues.apache.org/jira/browse/HDFS-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6659: - Target Version/s: (was: 2.8.0) > Create a Block List > --- > > Key: HDFS-6659 > URL: https://issues.apache.org/jira/browse/HDFS-6659 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Amir Langer > Labels: perfomance > Attachments: HDFS-6659.patch > > > BlockList - An efficient array based list that can extend its capacity with > two main features: > 1. Gaps (result of remove operations) are managed internally without the need > for extra memory - We create a linked list of gaps by using the array index > as references + An int to the head of the gaps list. In every insert > operation, we first use any available gap before extending the array. > 2. Array extension is done by chaining different arrays, not by allocating a > larger array and copying all its data across. This is a lot less heavy in > terms of latency for that particular call. It also avoids having large amount > of contiguous heap space and so behaves nicer with garbage collection. > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6782) Improve FS editlog logSync
[ https://issues.apache.org/jira/browse/HDFS-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6782: - Target Version/s: (was: 2.8.0) > Improve FS editlog logSync > -- > > Key: HDFS-6782 > URL: https://issues.apache.org/jira/browse/HDFS-6782 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.4.1 >Reporter: Yi Liu >Assignee: Yi Liu > Attachments: HDFS-6782.001.patch, HDFS-6782.002.patch > > > In the NN, log sync uses a double buffer (bufCurrent, bufReady): bufCurrent > buffers newly arriving edit ops and bufReady is for flushing. This is > efficient. When a flush is ongoing and bufCurrent is full, the NN goes into a > force log sync, and all new ops are blocked (since force log sync is > protected by the FSNamesystem write lock). After the flush finishes, the new ops > are still blocked, but at that point bufCurrent is actually free and ops could > go ahead and write to the buffer. The following diagram shows the detail. > This JIRA is for this improvement. Thanks [~umamaheswararao] for confirming > this issue. > {code} > edit1(txid1) -- write to bufCurrent logSync - (swap > buffer)flushing --- > edit2(txid2) -- write to bufCurrent logSync - waiting > --- > edit3(txid3) -- write to bufCurrent logSync - waiting > --- > edit4(txid4) -- write to bufCurrent logSync - waiting > --- > edit5(txid5) -- write to bufCurrent --full-- force sync - waiting > --- > edit6(txid6) -- blocked > ... > editn(txidn) -- blocked > {code} > After the flush, it becomes > {code} > edit1(txid1) -- write to bufCurrent logSync - finished > > edit2(txid2) -- write to bufCurrent logSync - flushing > --- > edit3(txid3) -- write to bufCurrent logSync - waiting > --- > edit4(txid4) -- write to bufCurrent logSync - waiting > --- > edit5(txid5) -- write to bufCurrent --full-- force sync - waiting > --- > edit6(txid6) -- blocked > ... > editn(txidn) -- blocked > {code} > After edit1 finishes, bufCurrent is free, and the thread that flushes txid2 > will also flush txid3-5, so we should return from the force sync of edit5 > and the FSNamesystem write lock will be freed (don't worry about the edit5 op > returning early: there will be a normal logSync after the force logSync, and > it will wait for the sync to finish). This is the idea of this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10480) Add an admin command to list currently open files
[ https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805860#comment-15805860 ] Junping Du commented on HDFS-10480: --- 2.8 is in RC stage, move to 2.9 > Add an admin command to list currently open files > - > > Key: HDFS-10480 > URL: https://issues.apache.org/jira/browse/HDFS-10480 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Rushabh S Shah > Attachments: HDFS-10480-trunk-1.patch, HDFS-10480-trunk.patch > > > Currently there is no easy way to obtain the list of active leases or files > being written. It will be nice if we have an admin command to list open files > and their lease holders. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6660) Use int instead of object reference to DatanodeStorageInfo in BlockInfo triplets,
[ https://issues.apache.org/jira/browse/HDFS-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6660: - Target Version/s: (was: 2.8.0) > Use int instead of object reference to DatanodeStorageInfo in BlockInfo > triplets, > - > > Key: HDFS-6660 > URL: https://issues.apache.org/jira/browse/HDFS-6660 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.4.1 >Reporter: Amir Langer >Assignee: Amir Langer > Labels: performance > Attachments: HDFS-6660.patch > > > Map an int index to every DatanodeStorageInfo and use it instead of object > reference in the BlockInfo triplets data structure. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6450: - Target Version/s: (was: 2.8.0) > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin P. McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
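The general hedged-read pattern (a sketch of the technique, not the actual DFSClient implementation) can be shown with a `CompletionService`: fire the primary read, and if it hasn't answered within a threshold, race a backup request against it and take whichever finishes first. The `HedgedRead` name and `hedgeAfterMillis` parameter are hypothetical:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Sketch: hedge a slow read by racing a backup request against it. */
class HedgedRead {
    static String read(ExecutorService pool, Callable<String> primary,
                       Callable<String> backup, long hedgeAfterMillis)
            throws Exception {
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        cs.submit(primary);
        Future<String> first = cs.poll(hedgeAfterMillis, TimeUnit.MILLISECONDS);
        if (first != null) {
            return first.get();        // primary answered within the threshold
        }
        cs.submit(backup);             // hedge: race the backup against it
        return cs.take().get();        // first of the two to finish wins
    }
}
```

HDFS-5776 applies this to positional reads against different DataNode replicas; this JIRA asks for the same treatment of regular (stateful) reads.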
[jira] [Updated] (HDFS-6308) TestDistributedFileSystem#testGetFileBlockStorageLocationsError is flaky
[ https://issues.apache.org/jira/browse/HDFS-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6308: - Target Version/s: (was: 2.8.0) > TestDistributedFileSystem#testGetFileBlockStorageLocationsError is flaky > > > Key: HDFS-6308 > URL: https://issues.apache.org/jira/browse/HDFS-6308 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Binglin Chang >Assignee: Binglin Chang > Attachments: HDFS-6308.v1.patch > > > Found this on pre-commit build of HDFS-6261 > {code} > java.lang.AssertionError: Expected one valid and one invalid volume > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.hdfs.TestDistributedFileSystem.testGetFileBlockStorageLocationsError(TestDistributedFileSystem.java:837) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6310) PBImageXmlWriter should output information about Delegation Tokens
[ https://issues.apache.org/jira/browse/HDFS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6310: - Target Version/s: (was: 2.8.0) > PBImageXmlWriter should output information about Delegation Tokens > -- > > Key: HDFS-6310 > URL: https://issues.apache.org/jira/browse/HDFS-6310 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.4.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HDFS-6310.patch > > > Separated from HDFS-6293. > The 2.4.0 pb-fsimage does contain tokens, but OfflineImageViewer with -XML > option does not show any tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6308) TestDistributedFileSystem#testGetFileBlockStorageLocationsError is flaky
[ https://issues.apache.org/jira/browse/HDFS-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6308: - Labels: (was: BB2015-05-TBR) > TestDistributedFileSystem#testGetFileBlockStorageLocationsError is flaky > > > Key: HDFS-6308 > URL: https://issues.apache.org/jira/browse/HDFS-6308 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Binglin Chang >Assignee: Binglin Chang > Attachments: HDFS-6308.v1.patch > > > Found this on pre-commit build of HDFS-6261 > {code} > java.lang.AssertionError: Expected one valid and one invalid volume > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.hdfs.TestDistributedFileSystem.testGetFileBlockStorageLocationsError(TestDistributedFileSystem.java:837) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6450) Support non-positional hedged reads in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6450: - Labels: (was: BB2015-05-TBR) > Support non-positional hedged reads in HDFS > --- > > Key: HDFS-6450 > URL: https://issues.apache.org/jira/browse/HDFS-6450 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.4.0 >Reporter: Colin P. McCabe >Assignee: Liang Xie > Attachments: HDFS-6450-like-pread.txt > > > HDFS-5776 added support for hedged positional reads. We should also support > hedged non-positional reads (aka regular reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6813: - Labels: (was: BB2015-05-TBR) > WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable > in a thread-safe manner. > --- > > Key: HDFS-6813 > URL: https://issues.apache.org/jira/browse/HDFS-6813 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Yi Liu >Assignee: Yi Liu > Attachments: HDFS-6813.001.patch > > > The {{PositionedReadable}} definition requires that implementations of its > interfaces be thread-safe. > OffsetUrlInputStream (WebHdfsFileSystem's input stream) doesn't implement these > interfaces in a thread-safe way; this JIRA fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
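The thread-safety requirement described above can be sketched as follows. This is a hypothetical illustration, not the actual WebHdfsFileSystem code: the point is that a positional read must compute purely from its position argument and leave the shared sequential-read cursor untouched, under a lock.

```java
// Hypothetical sketch (not the actual OffsetUrlInputStream implementation):
// a stream whose positional read satisfies the PositionedReadable contract.
class SafeOffsetStream {
    private final byte[] data;
    private long offset; // shared sequential-read cursor, guarded by "this"

    SafeOffsetStream(byte[] data) {
        this.data = data;
    }

    // Sequential read: advances the shared cursor under the lock.
    synchronized int read(byte[] buf, int off, int len) {
        int n = read(offset, buf, off, len);
        if (n > 0) {
            offset += n;
        }
        return n;
    }

    // Positional read: computes purely from "position" and never touches the
    // shared cursor, so concurrent callers cannot corrupt each other's state.
    synchronized int read(long position, byte[] buf, int off, int len) {
        if (position >= data.length) {
            return -1; // past end of stream
        }
        int n = Math.min(len, data.length - (int) position);
        System.arraycopy(data, (int) position, buf, off, n);
        return n;
    }
}
```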
[jira] [Updated] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6515: - Target Version/s: 2.9.0 (was: 2.8.0) > testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) > - > > Key: HDFS-6515 > URL: https://issues.apache.org/jira/browse/HDFS-6515 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.4.0, 2.4.1, 3.0.0-alpha1 > Environment: Linux on PPC64 > Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora > 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test >Reporter: Tony Reix > Labels: BB2015-05-TBR, hadoop, test > Attachments: HDFS-6515-1.patch, HDFS-6515-2.patch > > > I have an issue with test : >testPageRounder > (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) > on Linux/PowerPC. > On Linux/Intel, test runs fine. > On Linux/PowerPC, I have: > testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) > Time elapsed: 64.037 sec <<< ERROR! > java.lang.Exception: test timed out after 6 milliseconds > Looking at details, I see that some "Failed to cache " messages appear in the > traces. Only 10 on Intel, but 186 on PPC64. > On PPC64, it looks like some thread is waiting for something that never > happens, generating a TimeOut. > I'm now using IBM JVM, however I've just checked that the issue also appears > with OpenJDK. > I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0 . > I need help for understanding what the test is doing, what traces are > expected, in order to understand what/where is the root cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8068) Do not retry rpc calls if the proxy contains an unresolved address
[ https://issues.apache.org/jira/browse/HDFS-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8068: - Target Version/s: (was: 2.8.0) > Do not retry rpc calls if the proxy contains an unresolved address > --- > > Key: HDFS-8068 > URL: https://issues.apache.org/jira/browse/HDFS-8068 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-8068.v1.patch, HDFS-8068.v2.patch > > > When the InetSocketAddress object happens to be unresolvable (e.g. due to a > transient DNS issue), the rpc proxy object will not be usable since the > client will throw UnknownHostException when a Connection object is created. > If FailoverOnNetworkExceptionRetry is used as in the standard HA failover > proxy, the call will be retried, but this will never recover. Instead, the > validity of the address must be checked at proxy creation, and an exception thrown if it is > invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
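The fail-fast check described in that issue might look like the following sketch. The helper name is illustrative, not the actual patch; `InetSocketAddress.isUnresolved()` is the relevant JDK call.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

class ProxyAddressCheck {
    // Hypothetical helper: fail fast at proxy-creation time instead of
    // letting a failover retry policy spin forever on an address that can
    // never produce a usable Connection.
    static InetSocketAddress checkResolved(InetSocketAddress addr)
            throws UnknownHostException {
        if (addr.isUnresolved()) {
            throw new UnknownHostException(
                "Cannot resolve " + addr.getHostString() + ":" + addr.getPort());
        }
        return addr;
    }
}
```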
[jira] [Updated] (HDFS-6813) WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable in a thread-safe manner.
[ https://issues.apache.org/jira/browse/HDFS-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6813: - Target Version/s: (was: 2.8.0) > WebHdfsFileSystem#OffsetUrlInputStream should implement PositionedReadable > in a thread-safe manner. > --- > > Key: HDFS-6813 > URL: https://issues.apache.org/jira/browse/HDFS-6813 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Yi Liu >Assignee: Yi Liu > Attachments: HDFS-6813.001.patch > > > The {{PositionedReadable}} definition requires that implementations of its > interfaces be thread-safe. > OffsetUrlInputStream (WebHdfsFileSystem's input stream) doesn't implement these > interfaces in a thread-safe way; this JIRA fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6310) PBImageXmlWriter should output information about Delegation Tokens
[ https://issues.apache.org/jira/browse/HDFS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-6310: - Labels: (was: BB2015-05-TBR) > PBImageXmlWriter should output information about Delegation Tokens > -- > > Key: HDFS-6310 > URL: https://issues.apache.org/jira/browse/HDFS-6310 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.4.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HDFS-6310.patch > > > Separated from HDFS-6293. > The 2.4.0 pb-fsimage does contain tokens, but OfflineImageViewer with -XML > option does not show any tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8697) Refactor DecommissionManager: more generic method names and misc cleanup
[ https://issues.apache.org/jira/browse/HDFS-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8697: - Target Version/s: (was: 2.8.0) > Refactor DecommissionManager: more generic method names and misc cleanup > > > Key: HDFS-8697 > URL: https://issues.apache.org/jira/browse/HDFS-8697 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Affects Versions: 2.7.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-8697.00.patch, HDFS-8697.01.patch > > > This JIRA merges the changes in {{DecommissionManager}} from the HDFS-7285 > branch, including changing a few method names to be more generic > ({{replicated}} -> {{stored}}), and some cleanups. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8508) TestJournalNode fails occasionally with bind exception
[ https://issues.apache.org/jira/browse/HDFS-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8508: - Target Version/s: (was: 2.8.0) > TestJournalNode fails occasionally with bind exception > -- > > Key: HDFS-8508 > URL: https://issues.apache.org/jira/browse/HDFS-8508 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.7.0 >Reporter: Arpit Agarwal >Assignee: Takashi Ohnishi > Attachments: HDFS-8508.1.patch > > > TestJournalNode uses the default port for {{dfs.journalnode.http-address}} so > it fails occasionally when running with {{-Pparallel-tests}}. > The same issue likely exists with other tests. Tests should generate a random > port number with retry to reduce the chances of a collision. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
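The random-port-with-retry suggestion from that issue can be sketched as below. Names and the retry count are illustrative, not the actual test code.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

class PortPicker {
    // Hypothetical sketch: pick random ports in the ephemeral range and
    // retry on bind failure, rather than relying on a fixed default port
    // that parallel test runs will collide on.
    static int findFreePort(int attempts) throws IOException {
        Random rnd = new Random();
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            int port = 49152 + rnd.nextInt(65536 - 49152);
            try (ServerSocket s = new ServerSocket(port)) {
                return port; // bound successfully; closed so the caller can reuse it
            } catch (IOException e) {
                last = e; // already in use; retry with another random port
            }
        }
        throw last != null ? last : new IOException("no free port found");
    }
}
```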
[jira] [Updated] (HDFS-8068) Do not retry rpc calls if the proxy contains an unresolved address
[ https://issues.apache.org/jira/browse/HDFS-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8068: - Labels: (was: BB2015-05-TBR) > Do not retry rpc calls if the proxy contains an unresolved address > --- > > Key: HDFS-8068 > URL: https://issues.apache.org/jira/browse/HDFS-8068 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-8068.v1.patch, HDFS-8068.v2.patch > > > When the InetSocketAddress object happens to be unresolvable (e.g. due to a > transient DNS issue), the rpc proxy object will not be usable since the > client will throw UnknownHostException when a Connection object is created. > If FailoverOnNetworkExceptionRetry is used as in the standard HA failover > proxy, the call will be retried, but this will never recover. Instead, the > validity of the address must be checked at proxy creation, and an exception thrown if it is > invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9341) Simulated slow disk in SimulatedFSDataset
[ https://issues.apache.org/jira/browse/HDFS-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9341: - Target Version/s: (was: 2.8.0) > Simulated slow disk in SimulatedFSDataset > - > > Key: HDFS-9341 > URL: https://issues.apache.org/jira/browse/HDFS-9341 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 2.7.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Minor > Attachments: HDFS-9341.00.patch > > > Besides simulating the byte content, {{SimulatedFSDataset}} can also simulate > the scenario where disk is slow when accessing certain bytes. The slowness > can be random or controlled at certain bytes. It can also be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8516) The 'hdfs crypto -listZones' should not print an extra newline at end of output
[ https://issues.apache.org/jira/browse/HDFS-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8516: - Target Version/s: (was: 2.8.0) > The 'hdfs crypto -listZones' should not print an extra newline at end of > output > --- > > Key: HDFS-8516 > URL: https://issues.apache.org/jira/browse/HDFS-8516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Affects Versions: 2.7.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Attachments: HDFS-8516.patch > > > It currently prints an extra newline (TableListing already adds a newline to > end of table string). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10620: -- Target Version/s: 2.9.0, 3.0.0-alpha2 (was: 2.8.0, 3.0.0-alpha2) > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10620-branch-2.01.patch, HDFS-10620.001.patch, > HDFS-10620.002.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
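A minimal sketch of the guarded-logging fix, with a boolean standing in for the logger's level check (the real code would consult the logging framework, e.g. `isDebugEnabled()`):

```java
class InvalidateLogger {
    // Hypothetical sketch: check the log level before building the string,
    // so the StringBuilder allocation and the per-block appends are skipped
    // entirely when logging is disabled.
    static String describeInvalidates(boolean logEnabled, long[] blockIds) {
        if (!logEnabled) {
            return null; // fast path: no allocation, no appends
        }
        StringBuilder sb = new StringBuilder("invalidating:");
        for (long id : blockIds) {
            sb.append(" blk_").append(id);
        }
        return sb.toString();
    }
}
```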
[jira] [Updated] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX
[ https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8694: - Target Version/s: (was: 2.8.0) > Expose the stats of IOErrors on each FsVolume through JMX > - > > Key: HDFS-8694 > URL: https://issues.apache.org/jira/browse/HDFS-8694 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-8694.000.patch, HDFS-8694.001.patch > > > Currently, once DataNode hits an {{IOError}} when writing / reading block > files, it starts a background {{DiskChecker.checkDirs()}} thread. But if this > thread successfully finishes, DN does not record this {{IOError}}. > We need one measurement to count all {{IOErrors}} for each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8088) Reduce the number of HTrace spans generated by HDFS reads
[ https://issues.apache.org/jira/browse/HDFS-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8088: - Target Version/s: (was: 2.8.0) > Reduce the number of HTrace spans generated by HDFS reads > - > > Key: HDFS-8088 > URL: https://issues.apache.org/jira/browse/HDFS-8088 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: HDFS-8088.001.patch > > > HDFS generates too many trace spans on read right now. Every call to read() > we make generates its own span, which is not very practical for things like > HBase or Accumulo that do many such reads as part of a single operation. > Instead of tracing every call to read(), we should only trace the cases where > we refill the buffer inside a BlockReader. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7142) Implement a 2Q eviction strategy for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7142: - Labels: (was: BB2015-05-TBR) > Implement a 2Q eviction strategy for HDFS-6581 > -- > > Key: HDFS-7142 > URL: https://issues.apache.org/jira/browse/HDFS-7142 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.7.0 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: 0002-Add-RamDiskReplica2QTracker.patch, > HDFS-7142.003.patch > > > We should implement a 2Q or approximate 2Q eviction strategy for HDFS-6581. > It is well known that LRU is a poor fit for scanning workloads, which HDFS > may often encounter. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7368) Support HDFS specific 'shell' on command 'hdfs dfs' invocation
[ https://issues.apache.org/jira/browse/HDFS-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7368: - Target Version/s: 2.9.0 (was: 2.8.0) > Support HDFS specific 'shell' on command 'hdfs dfs' invocation > -- > > Key: HDFS-7368 > URL: https://issues.apache.org/jira/browse/HDFS-7368 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-7368-001.patch > > > * *hadoop fs* is the generic implementation for all filesystem > implementations, but some of the operations are supported only in some > filesystems. Ex: snapshot commands, acl commands, xattr commands. > * *hdfs dfs* is recommended in all hdfs related docs in current releases. > In the current code both *hdfs shell* and *hadoop fs* point to the hadoop common > implementation of FSShell. > It would be better to have an HDFS-specific extension of FSShell which includes > HDFS-only commands in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9884) Use doxia macro to generate in-page TOC of HDFS site documentation
[ https://issues.apache.org/jira/browse/HDFS-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9884: - Target Version/s: (was: 2.8.0) > Use doxia macro to generate in-page TOC of HDFS site documentation > -- > > Key: HDFS-9884 > URL: https://issues.apache.org/jira/browse/HDFS-9884 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 2.7.0 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki > Attachments: HDFS-9884.001.patch, HDFS-9884.002.patch > > > Since maven-site-plugin 3.5 was released, we can use toc macro in Markdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7304) TestFileCreation#testOverwriteOpenForWrite hangs
[ https://issues.apache.org/jira/browse/HDFS-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7304: - Labels: (was: BB2015-05-TBR) > TestFileCreation#testOverwriteOpenForWrite hangs > > > Key: HDFS-7304 > URL: https://issues.apache.org/jira/browse/HDFS-7304 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Akira Ajisaka > Attachments: HDFS-7304.patch, HDFS-7304.patch > > > The test case times out. It has been observed in multiple pre-commit builds. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7142) Implement a 2Q eviction strategy for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7142: - Target Version/s: (was: 2.8.0) > Implement a 2Q eviction strategy for HDFS-6581 > -- > > Key: HDFS-7142 > URL: https://issues.apache.org/jira/browse/HDFS-7142 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.7.0 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: 0002-Add-RamDiskReplica2QTracker.patch, > HDFS-7142.003.patch > > > We should implement a 2Q or approximate 2Q eviction strategy for HDFS-6581. > It is well known that LRU is a poor fit for scanning workloads, which HDFS > may often encounter. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8469) Lockfiles are not being created for datanode storage directories
[ https://issues.apache.org/jira/browse/HDFS-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8469: - Target Version/s: (was: 2.8.0) > Lockfiles are not being created for datanode storage directories > > > Key: HDFS-8469 > URL: https://issues.apache.org/jira/browse/HDFS-8469 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.4.0 >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Attachments: HDFS-8469.001.patch > > > Lockfiles are not being created for datanode storage directories. Due to a > mixup, we are initializing the StorageDirectory class with shared=true (an > option which was only intended for NFS directories used to implement NameNode > HA). Setting shared=true disables lockfile generation and prints a log > message like this: > {code} > 2015-05-22 11:45:16,367 INFO common.Storage (Storage.java:lock(675)) - > Locking is disabled for > /home/cmccabe/hadoop2/hadoop-hdfs-project/hadoop-hdfs/target/ > test/data/dfs/data/data5/current/BP-122766180-127.0.0.1-1432320314834 > {code} > Without lock files, we could accidentally spawn two datanode processes using > the same directories without realizing it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
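What the missing lockfile provides can be sketched with `java.nio` file locking. This is an illustration of the mechanism only, not the actual Storage code; the file name and helper are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

class StorageLock {
    // Hypothetical sketch: take an exclusive OS-level lock on an
    // in_use.lock file so that a second process trying to claim the same
    // storage directory fails instead of silently sharing it.
    static FileLock tryLock(String lockPath) throws IOException {
        RandomAccessFile file = new RandomAccessFile(lockPath, "rws");
        FileLock lock = file.getChannel().tryLock();
        if (lock == null) { // another process already holds the lock
            file.close();
            throw new IOException("Storage directory already locked: " + lockPath);
        }
        return lock; // held for the lifetime of this process
    }
}
```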
[jira] [Updated] (HDFS-7368) Support HDFS specific 'shell' on command 'hdfs dfs' invocation
[ https://issues.apache.org/jira/browse/HDFS-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7368: - Labels: (was: BB2015-05-TBR) > Support HDFS specific 'shell' on command 'hdfs dfs' invocation > -- > > Key: HDFS-7368 > URL: https://issues.apache.org/jira/browse/HDFS-7368 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-7368-001.patch > > > * *hadoop fs* is the generic implementation for all filesystem > implementations, but some of the operations are supported only in some > filesystems. Ex: snapshot commands, acl commands, xattr commands. > * *hdfs dfs* is recommended in all hdfs related docs in current releases. > In the current code both *hdfs shell* and *hadoop fs* point to the hadoop common > implementation of FSShell. > It would be better to have an HDFS-specific extension of FSShell which includes > HDFS-only commands in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7304) TestFileCreation#testOverwriteOpenForWrite hangs
[ https://issues.apache.org/jira/browse/HDFS-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7304: - Target Version/s: 2.8.0, 2.9.0 (was: 2.8.0) > TestFileCreation#testOverwriteOpenForWrite hangs > > > Key: HDFS-7304 > URL: https://issues.apache.org/jira/browse/HDFS-7304 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Akira Ajisaka > Attachments: HDFS-7304.patch, HDFS-7304.patch > > > The test case times out. It has been observed in multiple pre-commit builds. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7408) Add a counter in the log that shows the number of block reports processed
[ https://issues.apache.org/jira/browse/HDFS-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7408: - Target Version/s: 2.9.0 (was: 2.8.0) > Add a counter in the log that shows the number of block reports processed > - > > Key: HDFS-7408 > URL: https://issues.apache.org/jira/browse/HDFS-7408 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Surendra Singh Lilhore > Attachments: HDFS-7408.001.patch > > > It would be great to have in the info log corresponding to block report > processing, printing information on how many block reports have been > processed. This can be useful to debug when namenode is unresponsive > especially during startup time to understand if datanodes are sending block > reports multiple times. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8115) Make PermissionStatusFormat public
[ https://issues.apache.org/jira/browse/HDFS-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8115: - Labels: (was: BB2015-05-TBR) > Make PermissionStatusFormat public > -- > > Key: HDFS-8115 > URL: https://issues.apache.org/jira/browse/HDFS-8115 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Attachments: HDFS-8115.1.patch > > > implementations of {{INodeAttributeProvider}} are required to provide an > implementation of {{getPermissionLong()}} method. Unfortunately, the long > permission format is an encoding of the user, group and mode with each field > converted to int using {{SerialNumberManager}} which is package protected. > Thus it would be nice to make the {{PermissionStatusFormat}} enum public (and > also make the {{toLong()}} static method public) so that user specified > implementations of {{INodeAttributeProvider}} may use it. > This would also make it more consistent with {{AclStatusFormat}} which I > guess has been made public for the same reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8115) Make PermissionStatusFormat public
[ https://issues.apache.org/jira/browse/HDFS-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8115: - Target Version/s: 2.9.0 (was: 2.8.0) > Make PermissionStatusFormat public > -- > > Key: HDFS-8115 > URL: https://issues.apache.org/jira/browse/HDFS-8115 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Attachments: HDFS-8115.1.patch > > > implementations of {{INodeAttributeProvider}} are required to provide an > implementation of {{getPermissionLong()}} method. Unfortunately, the long > permission format is an encoding of the user, group and mode with each field > converted to int using {{SerialNumberManager}} which is package protected. > Thus it would be nice to make the {{PermissionStatusFormat}} enum public (and > also make the {{toLong()}} static method public) so that user specified > implementations of {{INodeAttributeProvider}} may use it. > This would also make it more consistent with {{AclStatusFormat}} which I > guess has been made public for the same reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7550) Minor followon cleanups from HDFS-7543
[ https://issues.apache.org/jira/browse/HDFS-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7550: - Target Version/s: 2.9.0 (was: 2.8.0) > Minor followon cleanups from HDFS-7543 > -- > > Key: HDFS-7550 > URL: https://issues.apache.org/jira/browse/HDFS-7550 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Charles Lamb >Priority: Minor > Attachments: HDFS-7550.001.patch > > > The commit of HDFS-7543 crossed paths with these comments: > FSDirMkdirOp.java > in #mkdirs, you removed the final String srcArg = src. This should be left > in. Many IDEs will whine about making assignments to formal args and that's > why it was put in in the first place. > FSDirRenameOp.java > #renameToInt, dstIIP (and resultingStat) could benefit from final's. > FSDirXAttrOp.java > I'm not sure why you've moved the call to getINodesInPath4Write and > checkXAttrChangeAccess inside the writeLock. > FSDirStatAndListing.java > The javadoc for the @param src needs to be changed to reflect that it's an > INodesInPath, not a String. Nit: it might be better to rename the > INodesInPath arg from src to iip. > #getFileInfo4DotSnapshot is now unused since you in-lined it into > #getFileInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-4319) FSShell copyToLocal creates files with the executable bit set
[ https://issues.apache.org/jira/browse/HDFS-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-4319: - Target Version/s: 2.9.0 (was: 2.8.0) > FSShell copyToLocal creates files with the executable bit set > - > > Key: HDFS-4319 > URL: https://issues.apache.org/jira/browse/HDFS-4319 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.3-alpha >Reporter: Colin P. McCabe >Priority: Minor > > With the default value of {{fs.permissions.umask-mode}}, {{022}}, {{FSShell > copyToLocal}} creates files with the executable bit set. > If, on the other hand, you change {{fs.permissions.umask-mode}} to something > like {{133}}, you encounter a different problem. When you use > {{copyToLocal}} to create directories, they don't have the executable bit > set, meaning they do not have search permission. > Since HDFS doesn't allow the executable bit to be set on files, it seems > illogical to add it in when using {{copyToLocal}}. This is also a > regression, since branch 1 did not have this problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-7550) Minor followon cleanups from HDFS-7543
[ https://issues.apache.org/jira/browse/HDFS-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-7550: - Labels: (was: BB2015-05-TBR) > Minor followon cleanups from HDFS-7543 > -- > > Key: HDFS-7550 > URL: https://issues.apache.org/jira/browse/HDFS-7550 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Charles Lamb >Priority: Minor > Attachments: HDFS-7550.001.patch > > > The commit of HDFS-7543 crossed paths with these comments: > FSDirMkdirOp.java > in #mkdirs, you removed the final String srcArg = src. This should be left > in. Many IDEs will whine about making assignments to formal args and that's > why it was put in in the first place. > FSDirRenameOp.java > #renameToInt, dstIIP (and resultingStat) could benefit from final's. > FSDirXAttrOp.java > I'm not sure why you've moved the call to getINodesInPath4Write and > checkXAttrChangeAccess inside the writeLock. > FSDirStatAndListing.java > The javadoc for the @param src needs to be changed to reflect that it's an > INodesInPath, not a String. Nit: it might be better to rename the > INodesInPath arg from src to iip. > #getFileInfo4DotSnapshot is now unused since you in-lined it into > #getFileInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3570: - Target Version/s: 2.9.0 (was: 2.8.0) > Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used > space > > > Key: HDFS-3570 > URL: https://issues.apache.org/jira/browse/HDFS-3570 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Akira Ajisaka >Priority: Minor > Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, > HDFS-3570.aash.1.patch > > > Report from a user here: > https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, > post archived at http://pastebin.com/eVFkk0A0 > This user had a specific DN that had a large non-DFS usage among > dfs.data.dirs, and very little DFS usage (which is computed against total > possible capacity). > Balancer apparently only looks at the usage, and ignores to consider that > non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a > DFS Usage report from DN is 8% only, its got a lot of free space to write > more blocks, when that isn't true as shown by the case of this user. It went > on scheduling writes to the DN to balance it out, but the DN simply can't > accept any more blocks as a result of its disks' state. > I think it would be better if we _computed_ the actual utilization based on > {{(100-(actual remaining space))/(capacity)}}, as opposed to the current > {{(dfs used)/(capacity)}}. Thoughts? > This isn't very critical, however, cause it is very rare to see DN space > being used for non DN data, but it does expose a valid bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
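One reading of the computation proposed in that issue, as a sketch (method names are illustrative): with 100 units of capacity, 8 DFS-used, and only 2 remaining because of non-DFS data, the current metric reports 8% while a remaining-based metric reports 98%.

```java
class BalancerUtil {
    // Hypothetical sketch comparing the two utilization formulas discussed
    // above. The current metric counts only DFS-used bytes; the proposed
    // metric charges everything that is not remaining, so non-DFS usage
    // counts against the node as well.
    static double currentUtilizationPct(long dfsUsed, long capacity) {
        return 100.0 * dfsUsed / capacity;
    }

    static double proposedUtilizationPct(long remaining, long capacity) {
        return 100.0 * (capacity - remaining) / capacity;
    }
}
```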
[jira] [Updated] (HDFS-10364) Log current node in reversexml tool when parsing fails
[ https://issues.apache.org/jira/browse/HDFS-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10364: -- Target Version/s: 2.9.0 (was: 2.8.0) > Log current node in reversexml tool when parsing fails > - > > Key: HDFS-10364 > URL: https://issues.apache.org/jira/browse/HDFS-10364 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-10364.01.patch > > > Sometimes we want to modify the xml before converting it. If an error > happens, it's hard to find out where. Adding a line that tells where the failure is > would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer
[ https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805834#comment-15805834 ] Junping Du commented on HDFS-7967: -- Thanks for reporting the issue and delivering the fix, [~daryn]! Do we plan to fix the Java 7 compile issue and get it committed today? If not, let's move it to 2.9 as I am planning to kick off the RC tonight. > Reduce the performance impact of the balancer > - > > Key: HDFS-7967 > URL: https://issues.apache.org/jira/browse/HDFS-7967 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch > > > The balancer needs to query for blocks to move from overly full DNs. The > block lookup is extremely inefficient. An iterator of the node's blocks is > created from the iterators of its storages' blocks. A random number is > chosen corresponding to how many blocks will be skipped via the iterator. > Each skip requires costly scanning of triplets. > The current design also only considers node imbalances while ignoring > imbalances within the node's storages. A more efficient and intelligent > design may eliminate the costly skipping of blocks via round-robin selection > of blocks from the storages based on remaining capacity. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
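The round-robin alternative sketched in HDFS-7967 above could look roughly like this. The types are stand-ins (a `Deque<String>` of block IDs per storage), not the balancer's real iterators, and capacity weighting is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative round-robin selection of blocks across a node's storages,
// the alternative HDFS-7967 proposes to skipping a random number of
// blocks through a merged iterator. All types here are stand-ins.
class RoundRobinPicker {

    static List<String> pick(List<Deque<String>> storages, int n) {
        List<String> picked = new ArrayList<>();
        boolean progress = true;
        // Cycle over storages, taking one block from each in turn, until
        // enough blocks are chosen or every storage is exhausted.
        while (picked.size() < n && progress) {
            progress = false;
            for (Deque<String> storage : storages) {
                if (picked.size() == n) {
                    break;
                }
                String block = storage.poll();
                if (block != null) {
                    picked.add(block);
                    progress = true;
                }
            }
        }
        return picked;
    }
}
```

Unlike the random-skip scheme, each block is visited at most once, and the per-storage rotation naturally spreads the moves across storages.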
[jira] [Commented] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized
[ https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805492#comment-15805492 ] Junping Du commented on HDFS-11209: --- No progress on this JIRA for a while; moving to 2.9. > SNN can't checkpoint when rolling upgrade is not finalized > -- > > Key: HDFS-11209 > URL: https://issues.apache.org/jira/browse/HDFS-11209 > Project: Hadoop HDFS > Issue Type: Bug > Components: rolling upgrades >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao >Priority: Critical > > A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432 > brings this back. > With HDFS-8432, the primary NN will not update the VERSION file to the new > version after running with the "rollingUpgrade" option until the upgrade is > finalized. This is to support more downgrade use cases. > However, the checkpoint on the SNN is incorrectly updating the VERSION file > when the rollingUpgrade is not finalized yet. As a result, the SNN checkpoints > successfully but fails to push the image to the primary NN because its version is > higher than the primary NN's, as shown below. > {code} > 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode > (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint > org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: > Image uploading failed, status: 403, url: > http://NN:50070/imagetransfer?txid=345404754=IMAGE..., > message: This namenode has storage info -60:221856466:1444080250181:clusterX > but the secondary expected -63:221856466:1444080250181:clusterX > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11209) SNN can't checkpoint when rolling upgrade is not finalized
[ https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-11209: -- Target Version/s: 2.9.0, 3.0.0-alpha2 (was: 2.8.0, 3.0.0-alpha2) > SNN can't checkpoint when rolling upgrade is not finalized > -- > > Key: HDFS-11209 > URL: https://issues.apache.org/jira/browse/HDFS-11209 > Project: Hadoop HDFS > Issue Type: Bug > Components: rolling upgrades >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao >Priority: Critical > > A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432 > brings this back. > With HDFS-8432, the primary NN will not update the VERSION file to the new > version after running with the "rollingUpgrade" option until the upgrade is > finalized. This is to support more downgrade use cases. > However, the checkpoint on the SNN is incorrectly updating the VERSION file > when the rollingUpgrade is not finalized yet. As a result, the SNN checkpoints > successfully but fails to push the image to the primary NN because its version is > higher than the primary NN's, as shown below. > {code} > 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode > (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint > org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: > Image uploading failed, status: 403, url: > http://NN:50070/imagetransfer?txid=345404754=IMAGE..., > message: This namenode has storage info -60:221856466:1444080250181:clusterX > but the secondary expected -63:221856466:1444080250181:clusterX > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9888) Allow resetting KerberosName in unit tests
[ https://issues.apache.org/jira/browse/HDFS-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9888: - Fix Version/s: (was: 2.9.0) > Allow resetting KerberosName in unit tests > - > > Key: HDFS-9888 > URL: https://issues.apache.org/jira/browse/HDFS-9888 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Minor > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-9888.01.patch > > > In some local environments, {{TestBalancer#testBalancerWithKeytabs}} may > fail. Specifically, the test case passes when run by itself, but running the {{TestBalancer}} suite > always fails. This is due to: > # Kerberos setup is done at test case setup > # the static variable {{KerberosName#defaultRealm}} is set at class > initialization - before the {{testBalancerWithKeytabs}} setup > # the local default realm is different from the test case's default realm > This is mostly an environment-specific problem, but let's not make such an > assumption in the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
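The kind of test hook HDFS-9888 asks for can be sketched generically with reflection: overwrite the static captured at class initialization so it can be re-derived after the test's Kerberos setup. The `Dummy` class and this helper are purely illustrative; KerberosName's actual reset mechanism may differ.

```java
import java.lang.reflect.Field;

// Generic sketch of resetting a static field between unit tests, so state
// captured at class-initialization time (like KerberosName#defaultRealm)
// does not leak across test cases. Dummy stands in for the real class.
class StaticFieldReset {

    static class Dummy {
        static String defaultRealm = "LOCAL.REALM"; // set at class init
    }

    // Overwrite a static field reflectively so a test can re-derive it
    // after changing the (Kerberos) configuration.
    static void reset(Class<?> clazz, String name, Object value) {
        try {
            Field f = clazz.getDeclaredField(name);
            f.setAccessible(true);
            f.set(null, value); // null receiver: static field
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A dedicated, visible-for-testing reset method on the class itself is usually preferable to reflection; the sketch only shows why a reset point removes the ordering dependency between class loading and test setup.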
[jira] [Updated] (HDFS-9885) Correct the distcp counters name while displaying counters
[ https://issues.apache.org/jira/browse/HDFS-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9885: - Fix Version/s: (was: 2.9.0) > Correct the distcp counters name while displaying counters > -- > > Key: HDFS-9885 > URL: https://issues.apache.org/jira/browse/HDFS-9885 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.7.1 >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-9885.001.patch, HDFS-9885.002.patch > > > In distcp cmd output, > hadoop distcp hdfs://NN1:port/file1 hdfs://NN2:port/file2 > 16/02/29 07:05:55 INFO tools.DistCp: DistCp job-id: job_1456729398560_0002 > 16/02/29 07:05:55 INFO mapreduce.Job: Running job: job_1456729398560_0002 > 16/02/29 07:06:01 INFO mapreduce.Job: Job job_1456729398560_0002 running in > uber mode : false > 16/02/29 07:06:01 INFO mapreduce.Job: map 0% reduce 0% > 16/02/29 07:06:06 INFO mapreduce.Job: map 100% reduce 0% > 16/02/29 07:06:07 INFO mapreduce.Job: Job job_1456729398560_0002 completed > successfully > ... > ... > File Input Format Counters > Bytes Read=212 > File Output Format Counters > Bytes Written=0{color:red} > org.apache.hadoop.tools.mapred.CopyMapper$Counter > {color} > BANDWIDTH_IN_BYTES=12418 > BYTESCOPIED=12418 > BYTESEXPECTED=12418 > COPY=1 > Expected: > Display Name can be given instead of > {color:red}"org.apache.hadoop.tools.mapred.CopyMapper$Counter"{color} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11278) Add missing @Test annotation for TestSafeMode.testSafeModeUtils()
[ https://issues.apache.org/jira/browse/HDFS-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-11278: -- Fix Version/s: (was: 2.9.0) > Add missing @Test annotation for TestSafeMode.testSafeModeUtils() > - > > Key: HDFS-11278 > URL: https://issues.apache.org/jira/browse/HDFS-11278 > Project: Hadoop HDFS > Issue Type: Test > Components: namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Lukas Majercak >Assignee: Lukas Majercak >Priority: Trivial > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-11278.001.patch > > Original Estimate: 0h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10960) TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error verification after volume remove
[ https://issues.apache.org/jira/browse/HDFS-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10960: -- Fix Version/s: (was: 2.9.0) > TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error > verification after volume remove > > > Key: HDFS-10960 > URL: https://issues.apache.org/jira/browse/HDFS-10960 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.0-alpha2 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10960.01.patch, HDFS-10960.02.patch > > > TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails occasionally in > the following verification. > {code} > 700 // If an IOException thrown from BlockReceiver#run, it triggers > 701 // DataNode#checkDiskError(). So we can test whether > checkDiskError() is called, > 702 // to see whether there is IOException in BlockReceiver#run(). > 703 assertEquals(lastTimeDiskErrorCheck, dn.getLastDiskErrorCheck()); > 704 > {code} > {noformat} > Error Message > expected:<0> but was:<6498109> > Stacktrace > java.lang.AssertionError: expected:<0> but was:<6498109> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWrittenForDatanode(TestDataNodeHotSwapVolumes.java:703) > at > org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten(TestDataNodeHotSwapVolumes.java:620) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3702: - Fix Version/s: (was: 2.9.0) > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are written for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs
[ https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10569: -- Fix Version/s: (was: 2.9.0) > A bug causes OutOfIndex error in BlockListAsLongs > - > > Key: HDFS-10569 > URL: https://issues.apache.org/jira/browse/HDFS-10569 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, > HDFS-10569.003.patch, HDFS-10569.004.patch > > > An obvious bug in LongsDecoder.getBlockListAsLongs(), the size of var *longs* > is +2 to the size of *values*, but the for-loop accesses *values* using > *longs* index. This will cause OutOfIndex. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
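The indexing bug described in HDFS-10569 above reduces to the sketch below. The real code lives in BlockListAsLongs' LongsDecoder; the array names here just follow the description ("longs" has two extra header slots relative to "values").

```java
// Minimal reconstruction of the out-of-bounds pattern from HDFS-10569:
// "longs" carries two extra header slots, so looping to longs.length
// while indexing "values" runs past the end of "values". Header values
// and names are illustrative, not the decoder's actual layout.
class LongsDecoderSketch {

    static long[] encode(long[] values) {
        long[] longs = new long[values.length + 2];
        longs[0] = values.length; // illustrative header slot
        longs[1] = 0L;            // illustrative header slot
        // Buggy form: for (int i = 0; i < longs.length; i++)
        //   longs[i + 2] = values[i]; // ArrayIndexOutOfBoundsException
        // Fixed bound iterates over values.length instead:
        for (int i = 0; i < values.length; i++) {
            longs[i + 2] = values[i];
        }
        return longs;
    }
}
```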
[jira] [Updated] (HDFS-10886) Replace "fs.default.name" with "fs.defaultFS" in viewfs document
[ https://issues.apache.org/jira/browse/HDFS-10886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10886: -- Fix Version/s: (was: 2.9.0) > Replace "fs.default.name" with "fs.defaultFS" in viewfs document > > > Key: HDFS-10886 > URL: https://issues.apache.org/jira/browse/HDFS-10886 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation, federation >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10886-002.patch, HDFS-10886.patch > > > Since the document has two sections, the update belongs in the *New World – Federation and ViewFs* section, not > in the *The Old World (Prior to Federation)* section. > "fs.default.name" is deprecated; we should use "fs.defaultFS". -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-742) A down DataNode makes Balancer hang by repeatedly asking NameNode for its partial block list
[ https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-742: Fix Version/s: (was: 2.9.0) > A down DataNode makes Balancer hang by repeatedly asking NameNode for its > partial block list > > > Key: HDFS-742 > URL: https://issues.apache.org/jira/browse/HDFS-742 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: Hairong Kuang >Assignee: Mit Desai >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-742-trunk.patch, HDFS-742.patch, > HDFS-742.v1.trunk.patch > > > We had a balancer that had not made any progress for a long time. It turned > out it was repeatedly asking the Namenode for a partial block list of one > datanode, which went down while the balancer was running. > The NameNode should notify the Balancer that the datanode is not available, and the > Balancer should stop asking for the datanode's block list. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11160) VolumeScanner reports write-in-progress replicas as corrupt incorrectly
[ https://issues.apache.org/jira/browse/HDFS-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-11160: -- Fix Version/s: (was: 2.9.0) > VolumeScanner reports write-in-progress replicas as corrupt incorrectly > --- > > Key: HDFS-11160 > URL: https://issues.apache.org/jira/browse/HDFS-11160 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode > Environment: CDH5.7.4 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11160.001.patch, HDFS-11160.002.patch, > HDFS-11160.003.patch, HDFS-11160.004.patch, HDFS-11160.005.patch, > HDFS-11160.006.patch, HDFS-11160.007.patch, HDFS-11160.008.patch, > HDFS-11160.branch-2.patch, HDFS-11160.reproduce.patch > > > Due to a race condition initially reported in HDFS-6804, VolumeScanner may > erroneously detect good replicas as corrupt. This is serious because in some > cases it results in data loss if all replicas are declared corrupt. This bug > is especially prominent when there are a lot of append requests via > HttpFs/WebHDFS. > We are investigating an incidence that caused very high block corruption rate > in a relatively small cluster. Initially, we thought HDFS-11056 is to blame. > However, after applying HDFS-11056, we are still seeing VolumeScanner > reporting corrupt replicas. > It turns out that if a replica is being appended while VolumeScanner is > scanning it, VolumeScanner may use the new checksum to compare against old > data, causing checksum mismatch. > I have a unit test to reproduce the error. Will attach later. A quick and > simple fix is to hold FsDatasetImpl lock and read from disk the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8492) DN should notify NN when client requests a missing block
[ https://issues.apache.org/jira/browse/HDFS-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8492: - Fix Version/s: (was: 2.9.0) > DN should notify NN when client requests a missing block > > > Key: HDFS-8492 > URL: https://issues.apache.org/jira/browse/HDFS-8492 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Walter Su > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-8492-02.patch, HDFS-8492.01.patch > > > If the DN has a block in its volume map but not on disk, it tells clients it's > an invalid block id. The NN is not informed of the missing block until > either the bp slice scanner or the directory scanner detects the missing > block. The DN should remove the replica from the volume map and inform the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10485) Fix findbugs warning in FSEditLog.java
[ https://issues.apache.org/jira/browse/HDFS-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10485: -- Fix Version/s: (was: 2.9.0) > Fix findbugs warning in FSEditLog.java > -- > > Key: HDFS-10485 > URL: https://issues.apache.org/jira/browse/HDFS-10485 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-10485.01.branch-2.patch, HDFS-10485.01.patch, > HDFS-10485.02.branch-2.patch, HDFS-10485.03.branch-2.patch > > > Found 1 findbugs warning when creating a patch for branch-2 in HDFS-10341 > (https://builds.apache.org/job/PreCommit-HDFS-Build/15639/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html) > {noformat} > Inconsistent synchronization of > org.apache.hadoop.hdfs.server.namenode.FSEditLog.numTransactionsBatchedInSync; > locked 50% of time > Bug type IS2_INCONSISTENT_SYNC (click for details) > In class org.apache.hadoop.hdfs.server.namenode.FSEditLog > Field > org.apache.hadoop.hdfs.server.namenode.FSEditLog.numTransactionsBatchedInSync > Synchronized 50% of the time > Unsynchronized access at FSEditLog.java:[line 676] > Unsynchronized access at FSEditLog.java:[line 676] > Synchronized access at FSEditLog.java:[line 1254] > Synchronized access at FSEditLog.java:[line 716] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
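The IS2_INCONSISTENT_SYNC pattern flagged in the findbugs report above reduces to a field written under the instance lock but read without it. The class below is a stand-alone illustration, not FSEditLog's code, and "take the same lock on the read path" is only one of several possible fixes.

```java
// Stand-alone illustration of findbugs' IS2_INCONSISTENT_SYNC warning:
// a counter written while synchronized but (in the commented-out form)
// read unsynchronized, so a reader may observe a stale or torn value.
class BatchedSyncCounter {

    private long numTransactionsBatchedInSync;

    synchronized void logSync(int batched) {
        numTransactionsBatchedInSync += batched; // locked write
    }

    // The unlocked read findbugs would flag as "synchronized 50% of the time":
    // long get() { return numTransactionsBatchedInSync; }

    // One fix: take the same lock on the read path (an AtomicLong or a
    // volatile field are alternative fixes).
    synchronized long get() {
        return numTransactionsBatchedInSync;
    }
}
```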
[jira] [Updated] (HDFS-10656) Optimize conversion of byte arrays back to path string
[ https://issues.apache.org/jira/browse/HDFS-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10656: -- Fix Version/s: (was: 2.9.0) > Optimize conversion of byte arrays back to path string > -- > > Key: HDFS-10656 > URL: https://issues.apache.org/jira/browse/HDFS-10656 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-10656.patch > > > {{DFSUtil.byteArray2PathString}} generates excessive object allocation. > # each byte array is encoded to a string (copy) > # string appended to a builder which extracts the chars from the intermediate > string (copy) and adds to its own char array > # builder's char array is re-alloced if over 16 chars (copy) > # builder's toString creates another string (copy) > Instead of allocating all these objects and performing multiple byte/char > encoding/decoding conversions, the byte array can be built in-place with a > single final conversion to a string. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
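The allocation difference described in HDFS-10656 above can be illustrated as follows. This is not DFSUtil's actual code; both methods join byte-array path components into a "/a/b"-style string, but the second allocates one buffer and decodes exactly once.

```java
import java.nio.charset.StandardCharsets;

// Illustrative comparison of the two conversion strategies from
// HDFS-10656 (not DFSUtil's actual implementation).
class PathStringSketch {

    // Allocation-heavy: decodes every component to an intermediate String,
    // whose chars the StringBuilder then copies again (plus re-allocs and
    // the final toString copy).
    static String perComponent(byte[][] components) {
        StringBuilder sb = new StringBuilder();
        for (byte[] c : components) {
            sb.append('/').append(new String(c, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // In-place: copy the raw bytes into one exactly-sized buffer and
    // perform a single byte-to-String decode at the end.
    static String inPlace(byte[][] components) {
        int len = 0;
        for (byte[] c : components) {
            len += c.length + 1; // component plus its leading '/'
        }
        byte[] path = new byte[len];
        int pos = 0;
        for (byte[] c : components) {
            path[pos++] = (byte) '/';
            System.arraycopy(c, 0, path, pos, c.length);
            pos += c.length;
        }
        return new String(path, StandardCharsets.UTF_8);
    }
}
```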
[jira] [Updated] (HDFS-9444) Add utility to find set of available ephemeral ports to ServerSocketUtil
[ https://issues.apache.org/jira/browse/HDFS-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9444: - Fix Version/s: (was: 2.9.0) > Add utility to find set of available ephemeral ports to ServerSocketUtil > > > Key: HDFS-9444 > URL: https://issues.apache.org/jira/browse/HDFS-9444 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Masatake Iwasaki > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-9444-branch-2.006.patch, > HDFS-9444-branch-2.007.patch, HDFS-9444.001.patch, HDFS-9444.002.patch, > HDFS-9444.003.patch, HDFS-9444.004.patch, HDFS-9444.005.patch, > HDFS-9444.006.patch > > > Unit tests using MiniDFSCluster with namenode-ha enabled need a set of port > numbers in advance. Because the namenodes talk to each other, we cannot set the ipc > port to 0 in the configuration to let each namenode decide its port number on its own. > ServerSocketUtil should provide a utility to find a set of available ephemeral > port numbers for this. > For example, TestEditLogTailer could fail due to {{java.net.BindException: > Address already in use}}.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2556/testReport/ > {noformat} > java.net.BindException: Problem binding to [localhost:42477] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:469) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:695) > at org.apache.hadoop.ipc.Server.(Server.java:2464) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:945) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:390) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:742) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:680) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:883) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:862) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1564) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482) > at > 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441) > at > org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRolls(TestEditLogTailer.java:139) > at > org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testNN1TriggersLogRolls(TestEditLogTailer.java:114) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
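A hedged sketch of the kind of utility HDFS-9444 asks ServerSocketUtil to provide: bind port 0 so the kernel assigns a free ephemeral port, and hold all sockets open until every port is chosen so the returned ports are distinct. The class and method names are illustrative, not the actual patch.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrative free-port finder for HDFS-9444 (names are assumptions,
// not ServerSocketUtil's real API). Note the inherent race: a returned
// port can be grabbed by another process after close() and before the
// test's MiniDFSCluster binds it.
class EphemeralPorts {

    static int[] getFreePorts(int count) throws IOException {
        ServerSocket[] sockets = new ServerSocket[count];
        int[] ports = new int[count];
        try {
            for (int i = 0; i < count; i++) {
                // Port 0 asks the kernel for any free ephemeral port;
                // keeping the socket open prevents duplicates below.
                sockets[i] = new ServerSocket(0);
                ports[i] = sockets[i].getLocalPort();
            }
        } finally {
            for (ServerSocket s : sockets) {
                if (s != null) {
                    s.close();
                }
            }
        }
        return ports;
    }
}
```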
[jira] [Updated] (HDFS-11087) NamenodeFsck should check if the output writer is still writable.
[ https://issues.apache.org/jira/browse/HDFS-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-11087: -- Fix Version/s: (was: 2.9.0) > NamenodeFsck should check if the output writer is still writable. > - > > Key: HDFS-11087 > URL: https://issues.apache.org/jira/browse/HDFS-11087 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.5 >Reporter: Konstantin Shvachko >Assignee: Erik Krogen > Fix For: 2.8.0, 2.7.4 > > Attachments: HDFS-11087-branch-2.000.patch, > HDFS-11087-branch-2.001.patch, HDFS-11087.branch-2.000.patch > > > {{NamenodeFsck}} keeps running even after the client was interrupted. So if > you start {{fsck /}} on a large namespace and kill the client, the NameNode > will keep traversing the tree for hours although there is nobody to receive > the result. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8897) Balancer should handle fs.defaultFS trailing slash in HA
[ https://issues.apache.org/jira/browse/HDFS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-8897: - Fix Version/s: (was: 2.9.0) > Balancer should handle fs.defaultFS trailing slash in HA > > > Key: HDFS-8897 > URL: https://issues.apache.org/jira/browse/HDFS-8897 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.7.1 > Environment: Centos 6.6 >Reporter: LINTE >Assignee: John Zhuge > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-8897-branch-2.006.patch, HDFS-8897.001.patch, > HDFS-8897.002.patch, HDFS-8897.003.patch, HDFS-8897.004.patch, > HDFS-8897.005.patch, HDFS-8897.006.patch > > > When balancer is launched, it should test if there is already a > /system/balancer.id file in HDFS. > When the file doesn't exist, the balancer don't want to run : > 15/08/14 16:35:12 INFO balancer.Balancer: namenodes = [hdfs://sandbox/, > hdfs://sandbox] > 15/08/14 16:35:12 INFO balancer.Balancer: parameters = > Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration > = 5, number of nodes to be excluded = 0, number of nodes to be included = 0] > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move > Bytes Being Moved > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/08/14 16:35:14 INFO block.BlockTokenSecretManager: Setting block keys > 15/08/14 16:35:14 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > java.io.IOException: Another Balancer is running.. Exiting ... 
> Aug 14, 2015 4:35:14 PM Balancing took 2.408 seconds > Looking at the audit log file when trying to run the balancer, the balancer > creates the /system/balancer.id and then deletes it on exiting ... > 2015-08-14 16:37:45,844 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=null perm=null proto=rpc > 2015-08-14 16:37:45,900 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=create > src=/system/balancer.id dst=null perm=hdfs:hadoop:rw-r- > proto=rpc > 2015-08-14 16:37:45,919 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=null perm=null proto=rpc > 2015-08-14 16:37:46,090 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=null perm=null proto=rpc > 2015-08-14 16:37:46,112 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=getfileinfo > src=/system/balancer.id dst=null perm=null proto=rpc > 2015-08-14 16:37:46,117 INFO FSNamesystem.audit: allowed=true > ugi=hdfs@SANDBOX.HADOOP (auth:KERBEROS) ip=/x.x.x.x cmd=delete > src=/system/balancer.id dst=null perm=null proto=rpc > The error seems to be located in > org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java > The function checkAndMarkRunning returns null even if the /system/balancer.id > doesn't exist before entering this function; if it exists, then it is deleted > and the balancer exits with the same error. > > private OutputStream checkAndMarkRunning() throws IOException { > try { > if (fs.exists(idPath)) { > // try appending to it so that it will fail fast if another balancer > is > // running.
> IOUtils.closeStream(fs.append(idPath)); > fs.delete(idPath, true); > } > final FSDataOutputStream fsout = fs.create(idPath, false); > // mark balancer idPath to be deleted during filesystem closure > fs.deleteOnExit(idPath); > if (write2IdFile) { > fsout.writeBytes(InetAddress.getLocalHost().getHostName()); > fsout.hflush(); > } > return fsout; > } catch(RemoteException e) { > > if(AlreadyBeingCreatedException.class.getName().equals(e.getClassName())){ > return null; > } else { > throw e; > } > } > } > > Regards -- This message was sent by Atlassian JIRA (v6.3.4#6332)
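The symptom reads like a URI-identity problem: with a trailing slash in fs.defaultFS, the balancer sees two "namenodes" (the log above shows namenodes = [hdfs://sandbox/, hdfs://sandbox]), registers balancer.id for the first, and then trips the already-running check for the second. A minimal sketch of the idea; the normalize helper below is hypothetical for illustration, not the actual Hadoop patch:

```java
import java.net.URI;

public class Main {
    // Hypothetical helper (not the committed fix): strip a bare trailing
    // slash so hdfs://sandbox/ and hdfs://sandbox identify the same
    // nameservice.
    static URI normalize(URI uri) {
        if ("/".equals(uri.getPath())) {
            return URI.create(uri.getScheme() + "://" + uri.getAuthority());
        }
        return uri;
    }

    public static void main(String[] args) {
        URI withSlash = URI.create("hdfs://sandbox/");
        URI bare = URI.create("hdfs://sandbox");
        // java.net.URI treats the two as distinct (path "/" vs ""), which
        // is why the balancer counted two namenodes.
        System.out.println(withSlash.equals(bare));            // false
        System.out.println(normalize(withSlash).equals(bare)); // true
    }
}
```

Deduplicating the nameservice URIs this way would leave a single registration, so checkAndMarkRunning is entered only once.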
[jira] [Updated] (HDFS-5805) TestCheckpoint.testCheckpoint fails intermittently on branch2
[ https://issues.apache.org/jira/browse/HDFS-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-5805: - Fix Version/s: (was: 2.9.0) > TestCheckpoint.testCheckpoint fails intermittently on branch2 > - > > Key: HDFS-5805 > URL: https://issues.apache.org/jira/browse/HDFS-5805 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Mit Desai >Assignee: Eric Badger > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-5805.001.patch > > > {noformat} > java.lang.AssertionError: Bad value for metric GetEditAvgTime > Expected: gt(0.0) > got: <0.0> > at org.junit.Assert.assertThat(Assert.java:780) > at > org.apache.hadoop.test.MetricsAsserts.assertGaugeGt(MetricsAsserts.java:341) > at > org.apache.hadoop.hdfs.server.namenode.TestCheckpoint.testCheckpoint(TestCheckpoint.java:1070) > {noformat}
[jira] [Updated] (HDFS-9804) Allow long-running Balancer to login with keytab
[ https://issues.apache.org/jira/browse/HDFS-9804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9804: - Fix Version/s: (was: 2.9.0) > Allow long-running Balancer to login with keytab > > > Key: HDFS-9804 > URL: https://issues.apache.org/jira/browse/HDFS-9804 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover, security >Reporter: Xiao Chen >Assignee: Xiao Chen > Labels: supportability > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-9804-branch-2.00.patch, HDFS-9804.01.patch, > HDFS-9804.02.patch, HDFS-9804.03.patch > > > From the discussion of HDFS-9698, it might be nice to allow the balancer to > run as a daemon and log in from a keytab.
[jira] [Updated] (HDFS-10336) TestBalancer failing intermittently because of not resetting UserGroupInformation completely
[ https://issues.apache.org/jira/browse/HDFS-10336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10336: -- Fix Version/s: (was: 2.9.0) > TestBalancer failing intermittently because of not resetting > UserGroupInformation completely > --- > > Key: HDFS-10336 > URL: https://issues.apache.org/jira/browse/HDFS-10336 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-alpha1 >Reporter: Yiqun Lin >Assignee: Yiqun Lin > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-10336.001.patch, HDFS-10336.002.patch, > HDFS-10336.003-simplefix.patch, HDFS-10336.003.patch > > > The unit test {{TestBalancer}} fails sometimes, for two main reasons. > * 1st. The test {{TestBalancer#testBalancerWithKeytabs}} times out. > {code} > org.apache.hadoop.hdfs.server.balancer.TestBalancer > testBalancerWithKeytabs(org.apache.hadoop.hdfs.server.balancer.TestBalancer) > Time elapsed: 300.41 sec <<< ERROR! > java.lang.Exception: test timed out after 30 milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.balancer.Dispatcher.waitForMoveCompletion(Dispatcher.java:1122) > at > org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchBlockMoves(Dispatcher.java:1096) > at > org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchAndCheckContinue(Dispatcher.java:1060) > at > org.apache.hadoop.hdfs.server.balancer.Balancer.runOneIteration(Balancer.java:635) > at > org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:689) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode(TestBalancer.java:1098) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancer.access$000(TestBalancer.java:125) > {code} > * 2nd. The test {{TestBalancer#testBalancerWithKeytabs}} sometimes does not > completely reset the {{UGI}} in the finally block.
This affected other > unit tests, which then threw {{IOException}}, like this: > {code} > testBalancerWithNonZeroThreadsForMove(org.apache.hadoop.hdfs.server.balancer.TestBalancer) > Time elapsed: 0 sec <<< ERROR! > java.io.IOException: Running in secure mode, but config doesn't have a keytab > at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:300) > {code} > More than one test is affected by this. We should add the following line > when resetting the {{UGI}} to avoid the potential exception: > {code} > UserGroupInformation.reset(); > {code}
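The failure mode generalizes beyond this test: anything that flips process-wide static state (here, UGI's login mode) must restore it in the finally block, or the next test in the same JVM inherits it. A plain-Java analogue of the pattern, with hypothetical stand-ins for the Hadoop classes:

```java
public class Main {
    // Stand-in for UserGroupInformation's static login state; in Hadoop
    // this is process-wide, so one test's kerberos setting leaks into
    // later tests unless it is explicitly reset.
    static String authentication = "simple";

    static void runKeytabTest() {
        authentication = "kerberos"; // the keytab test switches to secure mode
        // ... balancer test body would run here ...
    }

    // Analogue of UserGroupInformation.reset()
    static void reset() {
        authentication = "simple";
    }

    public static void main(String[] args) {
        try {
            runKeytabTest();
        } finally {
            // Without this, a later test sees "secure mode" with no keytab
            // and fails with the IOException quoted above.
            reset();
        }
        System.out.println(authentication);
    }
}
```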
[jira] [Updated] (HDFS-10674) Optimize creating a full path from an inode
[ https://issues.apache.org/jira/browse/HDFS-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10674: -- Fix Version/s: (was: 2.9.0) > Optimize creating a full path from an inode > --- > > Key: HDFS-10674 > URL: https://issues.apache.org/jira/browse/HDFS-10674 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-10674.patch > > > {{INode#getFullPathName}} walks up the inode tree, creating an INode[] and > converting each component's byte[] name to a String while building the path. > This involves many allocations, copies, and char conversions. > The path should be built with a single byte[] allocation.
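The single-allocation approach can be sketched in plain Java. The Node type below is a hypothetical stand-in for INode (the real patch lives in the HDFS tree): one pass up the parent chain sizes the path, a second pass fills a single byte[] from the end, with no intermediate array or per-component Strings.

```java
import java.nio.charset.StandardCharsets;

public class Main {
    // Minimal stand-in for an inode: a byte[] name plus a parent pointer.
    static final class Node {
        final byte[] name;
        final Node parent;
        Node(String name, Node parent) {
            this.name = name.getBytes(StandardCharsets.UTF_8);
            this.parent = parent;
        }
    }

    // Sketch of the optimization: size the path in one walk, then fill a
    // single byte[] backwards, prefixing each component with '/'.
    static String fullPath(Node inode) {
        int len = 0;
        for (Node n = inode; n.parent != null; n = n.parent) {
            len += 1 + n.name.length; // '/' + component
        }
        if (len == 0) return "/";     // root inode
        byte[] path = new byte[len];
        int pos = len;
        for (Node n = inode; n.parent != null; n = n.parent) {
            pos -= n.name.length;
            System.arraycopy(n.name, 0, path, pos, n.name.length);
            path[--pos] = '/';
        }
        return new String(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        Node file = new Node("data.txt", new Node("alice", new Node("user", root)));
        System.out.println(fullPath(file)); // /user/alice/data.txt
    }
}
```

Only the final new String conversion allocates beyond the one byte[], which is the point of the JIRA.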
[jira] [Updated] (HDFS-9855) Modify TestAuditLoggerWithCommands to work around the absence of HDFS-8332
[ https://issues.apache.org/jira/browse/HDFS-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-9855: - Fix Version/s: (was: 2.9.0) > Modify TestAuditLoggerWithCommands to work around the absence of HDFS-8332 > - > > Key: HDFS-9855 > URL: https://issues.apache.org/jira/browse/HDFS-9855 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 2.8.0, 2.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 2.8.0 > > Attachments: HDFS-9855-branch-2.001.patch > > > The addition of setQuota audit log testing throws an AccessControlException > instead of the expected FileSystemClosed IOException even when the filesystem > has been explicitly closed; other calls behave as expected during a trial > test. This is seen on branch-2 and not on trunk, requiring investigation of a > possible bug/discrepancy. > CC:[~kihwal].
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10544: -- Fix Version/s: (was: 2.9.0) > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, ha >Affects Versions: 2.6.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.6.5, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs when HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId.
[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable
[ https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10798: -- Fix Version/s: (was: 2.9.0) > Make the threshold of reporting FSNamesystem lock contention configurable > - > > Key: HDFS-10798 > URL: https://issues.apache.org/jira/browse/HDFS-10798 > Project: Hadoop HDFS > Issue Type: Improvement > Components: logging, namenode >Reporter: Zhe Zhang >Assignee: Erik Krogen > Labels: newbie > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-10789.001.patch, HDFS-10789.002.patch > > > Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. > In a busy system a higher threshold (lower logging overhead) might be desired; > in other scenarios, more aggressive reporting might be preferred. We should > make the threshold configurable.
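The shape of the change can be sketched in a few lines; the key name and default below are assumptions for illustration (the committed patch defines the real names in the HDFS configuration classes), and a plain static field stands in for reading from a Configuration:

```java
public class Main {
    // Assumed names for illustration only, not the committed constants.
    static final String THRESHOLD_KEY = "dfs.namenode.write-lock-reporting-threshold-ms";
    static final long THRESHOLD_DEFAULT_MS = 1000; // the old hard-coded 1 second

    // In the real code this would come from conf.getLong(THRESHOLD_KEY, ...).
    static long thresholdMs = THRESHOLD_DEFAULT_MS;

    // Report lock contention only when the hold time exceeds the
    // configured threshold, instead of a fixed 1s.
    static boolean shouldReport(long lockHeldMs) {
        return lockHeldMs > thresholdMs;
    }

    public static void main(String[] args) {
        System.out.println(shouldReport(500)); // false at the 1s default
        thresholdMs = 100;                     // lowered for aggressive reporting
        System.out.println(shouldReport(500)); // true
    }
}
```

Raising the threshold trades visibility for lower logging overhead on a busy namenode; lowering it surfaces shorter lock holds.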
[jira] [Updated] (HDFS-10889) Remove outdated Fault Injection Framework documentation
[ https://issues.apache.org/jira/browse/HDFS-10889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-10889: -- Fix Version/s: (was: 2.9.0) > Remove outdated Fault Injection Framework documentation > -- > > Key: HDFS-10889 > URL: https://issues.apache.org/jira/browse/HDFS-10889 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-10889.patch > > > The Fault Injection Framework was introduced in HDFS-435, but the related > code was later removed while the documentation was not. > We can remove this stale doc: > http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/FaultInjectFramework.html