[jira] [Created] (HDFS-10625) VolumeScanner to report why a block is found bad
Yongjun Zhang created HDFS-10625: Summary: VolumeScanner to report why a block is found bad Key: HDFS-10625 URL: https://issues.apache.org/jira/browse/HDFS-10625 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs Reporter: Yongjun Zhang VolumeScanner may report: {code} WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad blk_1170125248_96458336 on /d/dfs/dn {code} It would be helpful to report the reason why the block is bad and, when the block is corrupt, the location of the first corrupted chunk in the block. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
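The request above (report where the first corrupted chunk is) can be illustrated with a minimal sketch. It is not VolumeScanner code: the class and method names are hypothetical, and it assumes per-chunk CRC32 checksums over fixed-size chunks, which only approximates how a DataNode stores checksums alongside block data.

```java
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of Hadoop.
final class FirstBadChunkSketch {

    /** Computes one CRC32 value per chunk of bytesPerChunk bytes (last chunk may be shorter). */
    static long[] checksums(byte[] data, int bytesPerChunk) {
        int chunks = (data.length + bytesPerChunk - 1) / bytesPerChunk;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChunk;
            int len = Math.min(bytesPerChunk, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /**
     * Returns the byte offset of the first chunk whose checksum differs from
     * the expected one, or -1 if every chunk matches.
     */
    static long firstCorruptOffset(byte[] data, long[] expected, int bytesPerChunk) {
        long[] actual = checksums(data, bytesPerChunk);
        int common = Math.min(actual.length, expected.length);
        for (int i = 0; i < common; i++) {
            if (actual[i] != expected[i]) {
                return (long) i * bytesPerChunk;
            }
        }
        // A differing chunk count means the data was truncated or extended.
        return actual.length == expected.length ? -1 : (long) common * bytesPerChunk;
    }
}
```

A log line could then include the returned offset next to the block ID, which is the kind of detail the issue asks for.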
[jira] [Deleted] (HDFS-10624) VolumeScanner to report why a block is found bad
[ https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang deleted HDFS-10624: --- > VolumeScanner to report why a block is found bad > > > Key: HDFS-10624 > URL: https://issues.apache.org/jira/browse/HDFS-10624 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yongjun Zhang
[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad
[ https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-10624: - Description: (was: Seeing the following on DN log. {code} 2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 received exception java.io.EOFException: Premature EOF: no length prefix available 2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK operation src: /10.204.64.137:45112 dst: /10.204.64.151:1110 java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244) at java.lang.Thread.run(Thread.java:745) 2016-04-07 20:27:46,116 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on /ngs8/app/lampp/dfs/dn 2016-04-07 20:27:46,117 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting because of exception java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547) at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621) 2016-04-07 20:27:46,118 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting. 2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.204.64.151, datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, infoSecurePort=1175, ipcPort=1120, storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 10.204.64.10:1110 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at sun.nio.ch.IOUtil.write(IOUtil.java:65) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.DataOutputStream.write(DataOutputStream.java:107) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) at org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585) at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758) at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154) at org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Connection
[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad
[ https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-10624: - Description: Seeing the following on DN log. {code} 2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 received exception java.io.EOFException: Premature EOF: no length prefix available 2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK operation src: /10.204.64.137:45112 dst: /10.204.64.151:1110 java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244) at java.lang.Thread.run(Thread.java:745) 2016-04-07 20:27:46,116 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on /ngs8/app/lampp/dfs/dn 2016-04-07 20:27:46,117 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting because of exception java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547) at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621) 2016-04-07 20:27:46,118 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting. 2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.204.64.151, datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, infoSecurePort=1175, ipcPort=1120, storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 10.204.64.10:1110 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at sun.nio.ch.IOUtil.write(IOUtil.java:65) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.DataOutputStream.write(DataOutputStream.java:107) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) at org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585) at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758) at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154) at org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Connection reset by
[jira] [Created] (HDFS-10624) VolumeScanner to report why a block is found bad
Yongjun Zhang created HDFS-10624: Summary: VolumeScanner to report why a block is found bad Key: HDFS-10624 URL: https://issues.apache.org/jira/browse/HDFS-10624 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs Reporter: Yongjun Zhang Seeing the following on DN log. {code} 2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 received exception java.io.EOFException: Premature EOF: no length prefix available 2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK operation src: /10.204.64.137:45112 dst: /10.204.64.151:1110 java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244) at java.lang.Thread.run(Thread.java:745) 2016-04-07 20:27:46,116 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on /ngs8/app/lampp/dfs/dn 2016-04-07 20:27:46,117 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting because of exception java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443) at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547) at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621) 2016-04-07 20:27:46,118 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) exiting. 2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.204.64.151, datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, infoSecurePort=1175, ipcPort=1120, storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 10.204.64.10:1110 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.write0(Native Method) at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at sun.nio.ch.IOUtil.write(IOUtil.java:65) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.DataOutputStream.write(DataOutputStream.java:107) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) at org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585) at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758) at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154) at org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118) at
[jira] [Commented] (HDFS-8065) Erasure coding: Support truncate at striped group boundary
[ https://issues.apache.org/jira/browse/HDFS-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376304#comment-15376304 ] Rakesh R commented on HDFS-8065: Hi, [~drankye], [~umamaheswararao], [~zhz] IMHO, truncate on a block group boundary can be supported without much effort compared to the partial stripe case; the latter will be addressed via the HDFS-7622 jira. Could you please help me push this jira for the {{3.0.0-alpha1}} release? Handling of partial stripe logic is needed for the lease recovery, truncation, and hflush cases, which I feel can be implemented together based on the HDFS-7661 design discussions. Thoughts are welcome. > Erasure coding: Support truncate at striped group boundary > -- > > Key: HDFS-8065 > URL: https://issues.apache.org/jira/browse/HDFS-8065 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Yi Liu >Assignee: Rakesh R > Attachments: HDFS-8065-00.patch, HDFS-8065-01.patch > > > We can support truncate at striped group boundary firstly.
[jira] [Updated] (HDFS-8065) Erasure coding: Support truncate at striped group boundary
[ https://issues.apache.org/jira/browse/HDFS-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8065: --- Summary: Erasure coding: Support truncate at striped group boundary (was: Erasure coding: Support truncate at striped group boundary.) > Erasure coding: Support truncate at striped group boundary > -- > > Key: HDFS-8065 > URL: https://issues.apache.org/jira/browse/HDFS-8065 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Yi Liu >Assignee: Rakesh R > Attachments: HDFS-8065-00.patch, HDFS-8065-01.patch > > > We can support truncate at striped group boundary firstly.
[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376257#comment-15376257 ] Hudson commented on HDFS-10617: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10095 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10095/]) HDFS-10617. PendingReconstructionBlocks.size() should be synchronized. (kihwal: rev 2bbc3ea1b54c25c28eb04caa48dece5cfc19d613) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Fix For: 2.9.0 > > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions.
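The issue description above argues that size() on a HashMap-backed structure needs the same lock as the other member functions. A minimal sketch of that pattern follows; the class, its fields, and its methods are hypothetical and do not mirror the real PendingReconstructionBlocks source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a HashMap-backed pending-block tracker.
final class PendingBlocksSketch {
    // HashMap is not thread-safe, so every access goes through the same monitor.
    private final Map<String, Integer> pending = new HashMap<>();

    /** Records one more pending attempt for the given block ID. */
    void increment(String blockId) {
        synchronized (pending) {
            pending.merge(blockId, 1, Integer::sum);
        }
    }

    /** Removes the block; returns true if it was being tracked. */
    boolean remove(String blockId) {
        synchronized (pending) {
            return pending.remove(blockId) != null;
        }
    }

    /** Reads the size under the same lock as the mutators. */
    int size() {
        synchronized (pending) {
            return pending.size();
        }
    }
}
```

Without the lock in size(), a reader could observe the map while another thread is resizing it, and it would have no visibility guarantee for recent updates.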
[jira] [Commented] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby
[ https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376238#comment-15376238 ] Rakesh R commented on HDFS-10477: - It looks like test case is failing due to lock release, please check. Secondly, when catching and swallowing {{InterruptedException}}, should we call {{Thread.currentThread().interrupt()}} afterward, so that the interrupt status isn't lost. {code} java.lang.IllegalMonitorStateException: null at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:371) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1533) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processExtraRedundancyBlocksOnReCommission(BlockManager.java:3861) at org.apache.hadoop.hdfs.server.blockmanagement.DecommissionManager.stopDecommission(DecommissionManager.java:221) at org.apache.hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy.testPlacementWithLocalRackNodesDecommissioned(TestDefaultBlockPlacementPolicy.java:117) {code} > Stop decommission a rack of DataNodes caused NameNode fail over to standby > -- > > Key: HDFS-10477 > URL: https://issues.apache.org/jira/browse/HDFS-10477 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.2 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: HDFS-10477.002.patch, HDFS-10477.003.patch, > HDFS-10477.004.patch, HDFS-10477.patch > > > In our cluster, when we stop decommissioning a rack which have 46 DataNodes, > it locked Namesystem for about 7 minutes as below log shows: > {code} > 2016-05-26 20:11:41,697 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.27:1004 > 2016-05-26 
20:11:51,171 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 285258 over-replicated blocks on 10.142.27.27:1004 during recommissioning > 2016-05-26 20:11:51,171 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.118:1004 > 2016-05-26 20:11:59,972 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 279923 over-replicated blocks on 10.142.27.118:1004 during recommissioning > 2016-05-26 20:11:59,972 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.113:1004 > 2016-05-26 20:12:09,007 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 294307 over-replicated blocks on 10.142.27.113:1004 during recommissioning > 2016-05-26 20:12:09,008 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.117:1004 > 2016-05-26 20:12:18,055 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 314381 over-replicated blocks on 10.142.27.117:1004 during recommissioning > 2016-05-26 20:12:18,056 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.130:1004 > 2016-05-26 20:12:25,938 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 272779 over-replicated blocks on 10.142.27.130:1004 during recommissioning > 2016-05-26 20:12:25,939 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.121:1004 > 2016-05-26 20:12:34,134 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 287248 over-replicated blocks on 10.142.27.121:1004 during recommissioning > 2016-05-26 20:12:34,134 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.33:1004 > 2016-05-26 20:12:43,020 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
Invalidated > 299868 over-replicated blocks on 10.142.27.33:1004 during recommissioning > 2016-05-26 20:12:43,020 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.137:1004 > 2016-05-26 20:12:52,220 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 303914 over-replicated blocks on 10.142.27.137:1004 during recommissioning > 2016-05-26 20:12:52,220 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.51:1004 > 2016-05-26 20:13:00,362 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 281175 over-replicated blocks on
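The review comment on HDFS-10477 above raises two points: an IllegalMonitorStateException when a write lock is released by a thread that does not hold it, and the loss of interrupt status when InterruptedException is swallowed. The sketch below illustrates both with plain JDK classes. The class and method names are hypothetical, and this is not the FSNamesystem locking code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration using only java.util.concurrent.
final class LockAndInterruptSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /**
     * Sleeps for the given time. If interrupted, the interrupt flag is set
     * again so callers further up the stack can still observe it.
     * Returns true if the sleep completed without interruption.
     */
    static boolean pause(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the status that the throw cleared
            return false;
        }
    }

    void writeLock() {
        lock.writeLock().lock();
    }

    /**
     * Releases the write lock only if the current thread holds it. Calling
     * unlock() unconditionally would throw IllegalMonitorStateException
     * when the lock is not held.
     */
    boolean writeUnlockIfHeld() {
        if (lock.isWriteLockedByCurrentThread()) {
            lock.writeLock().unlock();
            return true;
        }
        return false;
    }
}
```

Guarding the unlock hides an unbalanced lock/unlock pair rather than fixing it, so in a patch it is usually better to find where the extra release happens; the guard is shown here only to make the failure mode concrete.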
[jira] [Created] (HDFS-10623) Remove unused import of httpclient.HttpConnection from TestWebHdfsTokens.
Jitendra Nath Pandey created HDFS-10623: --- Summary: Remove unused import of httpclient.HttpConnection from TestWebHdfsTokens. Key: HDFS-10623 URL: https://issues.apache.org/jira/browse/HDFS-10623 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Reporter: Jitendra Nath Pandey Assignee: Hanisha Koneru TestWebHdfsTokens imports httpclient.HttpConnection, which causes an unnecessary reference to httpclient. The import can be removed.
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-10617: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Fix For: 2.9.0 > > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions.
[jira] [Commented] (HDFS-10612) Optimize mechanism when block report size exceed the limit of PB message
[ https://issues.apache.org/jira/browse/HDFS-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376229#comment-15376229 ] Yuanbo Liu commented on HDFS-10612: --- Block report size is difficult to calculate, because the block count and the block report size are not linearly related if the datanode uses blockbuffer. Calculating the report size whenever we write a block would also cost performance. The least we can do is add a warn log, expose block report size as a metric, and add this metric to the datanode web UI. > Optimize mechanism when block report size exceed the limit of PB message > > > Key: HDFS-10612 > URL: https://issues.apache.org/jira/browse/HDFS-10612 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yuanbo Liu > > Community has made block report size configurable in HDFS-10312. But there is > still a risk for Hadoop. If block report size exceeds PB message size, the > cluster may be in a danger situation
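The "warn log" idea in the comment above could look roughly like the following sketch. Everything here is an assumption for illustration: the class, the enum, the warn fraction, and the 64 MB figure used as a stand-in for the configured message limit. It classifies an already-known serialized report size and does not attempt to compute that size.

```java
// Hypothetical helper; not an existing Hadoop class.
final class BlockReportSizeCheck {
    /** Assumed limit for illustration (64 MB); the real limit is configurable. */
    static final long ASSUMED_MAX_MESSAGE_BYTES = 64L * 1024 * 1024;

    enum Level { OK, WARN, TOO_LARGE }

    /**
     * Classifies a serialized block report size against the message limit.
     * warnFraction is the share of the limit at which a warning starts,
     * for example 0.8 to warn from 80% of the limit upward.
     */
    static Level classify(long reportBytes, long maxBytes, double warnFraction) {
        if (reportBytes > maxBytes) {
            return Level.TOO_LARGE;
        }
        if (reportBytes >= (long) (maxBytes * warnFraction)) {
            return Level.WARN;
        }
        return Level.OK;
    }
}
```

A datanode-side caller could log at WARN level and publish reportBytes as a metric whenever classify returns WARN, giving operators time to raise the limit or split reports before TOO_LARGE is reached.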
[jira] [Assigned] (HDFS-10612) Optimize mechanism when block report size exceed the limit of PB message
[ https://issues.apache.org/jira/browse/HDFS-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanbo Liu reassigned HDFS-10612: - Assignee: Yuanbo Liu > Optimize mechanism when block report size exceed the limit of PB message > > > Key: HDFS-10612 > URL: https://issues.apache.org/jira/browse/HDFS-10612 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yuanbo Liu >Assignee: Yuanbo Liu > > Community has made block report size configurable in HDFS-10312. But there is > still a risk for Hadoop. If block report size exceeds PB message size, the > cluster may be in a danger situation
[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376227#comment-15376227 ] Kihwal Lee commented on HDFS-10617: --- +1 lgtm > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions.
[jira] [Commented] (HDFS-10534) NameNode WebUI should display DataNode usage histogram
[ https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376186#comment-15376186 ] Kai Sasaki commented on HDFS-10534: --- [~zhz] Oh, you are next to dust team! Thanks! > NameNode WebUI should display DataNode usage histogram > -- > > Key: HDFS-10534 > URL: https://issues.apache.org/jira/browse/HDFS-10534 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode, ui >Reporter: Zhe Zhang >Assignee: Kai Sasaki > Attachments: HDFS-10534.01.patch, HDFS-10534.02.patch, > HDFS-10534.03.patch, HDFS-10534.04.patch, HDFS-10534.05.patch, > HDFS-10534.06.patch, HDFS-10534.07.patch, HDFS-10534.08.patch, Screen Shot > 2016-06-23 at 6.25.50 AM.png, Screen Shot 2016-07-07 at 23.29.14.png, > table_histogram.html > > > In addition to *Min/Median/Max*, another meaningful metric for cluster > balance is DN usage in histogram style. > Since the NN already provides the information needed to calculate a histogram of > DN usage, it can be done on the JS side.
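The issue description above proposes deriving a usage histogram from per-DataNode usage figures. The bucketing itself is simple; the sketch below shows it in Java for consistency with the rest of this archive, although the issue places the computation in the web UI's JavaScript. The class name, the percentage input, and the equal-width buckets are assumptions.

```java
// Hypothetical bucketing helper; not taken from the HDFS-10534 patches.
final class UsageHistogramSketch {

    /**
     * Counts DataNodes per equal-width usage bucket over the range 0..100 percent.
     * Values outside the range are clamped, and 100% falls into the last bucket.
     */
    static int[] histogram(double[] usagePercents, int buckets) {
        int[] counts = new int[buckets];
        double width = 100.0 / buckets;
        for (double usage : usagePercents) {
            double clamped = Math.max(0.0, Math.min(100.0, usage));
            int index = Math.min(buckets - 1, (int) (clamped / width));
            counts[index]++;
        }
        return counts;
    }
}
```

With ten buckets, a balanced cluster shows most DataNodes in one or two adjacent buckets, while a skewed one spreads across many, which Min/Median/Max alone cannot show.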
[jira] [Commented] (HDFS-10467) Router-based HDFS federation
[ https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376170#comment-15376170 ] Inigo Goiri commented on HDFS-10467: After checking the code, I think there might be a bunch of overlaps between this work and YARN-2915. I'd like to explore what we could move into Hadoop commons to manage a federated space. I would probably open a new JIRA for that. In addition, given the feedback collected during the last few weeks, it seems like the community is OK with going in this direction, so I'd like to start moving the review process forward. To simplify the review, I propose to convert this JIRA into an umbrella and split the current patch into smaller subtasks. For now, I would like to start with: # Minimum Router # State Store interface # ZooKeeper State Store implementation We can add more tasks if people think this is the way to go. It's probably a good idea to create a new branch for this effort. Thoughts? Opinions? > Router-based HDFS federation > > > Key: HDFS-10467 > URL: https://issues.apache.org/jira/browse/HDFS-10467 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 2.7.2 >Reporter: Inigo Goiri > Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, > HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch > > > Add a Router to provide a federated view of multiple HDFS clusters.
[jira] [Commented] (HDFS-10619) Cache path in InodesInPath
[ https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376158#comment-15376158 ] Yiqun Lin commented on HDFS-10619: -- Hi, [~daryn], the patch looks good, but the failed test seems related. I tested your patch in my local env and found that the byte array {{path}} passed in sometimes looks like this: {code} [null, [102, 111, 111]] {code} Here pathComponents[0] is null while pathComponents[1] has values, so the method {{DFSUtil#byteArray2PathString}} throws an NPE. Can we add this logic change to handle this special case? {code} public static String byteArray2PathString(byte[][] pathComponents, int offset, int length) { if (pathComponents.length == 0) { return ""; } Preconditions.checkArgument(offset >= 0 && offset < pathComponents.length); Preconditions.checkArgument(length >= 0 && offset + length <= pathComponents.length); if (pathComponents.length == 1 && (pathComponents[0] == null || pathComponents[0].length == 0)) { return Path.SEPARATOR; } else if (pathComponents.length > 1 && (pathComponents[0] == null || pathComponents[0].length == 0)) { /* Add this logic */ return ""; } ... {code} > Cache path in InodesInPath > -- > > Key: HDFS-10619 > URL: https://issues.apache.org/jira/browse/HDFS-10619 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-10619.patch > > > INodesInPath#getPath, a frequently called method, dynamically builds the > path. IIP should cache the path upon construction.
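The null leading component reported in the comment above can also be handled by treating a null or empty first entry as the root marker. The following is a minimal, self-contained sketch of that idea. It is neither the change proposed in the comment (which returns an empty string for that case) nor the actual {{DFSUtil#byteArray2PathString}} implementation; the class name PathJoinSketch and the method join are hypothetical, and offset/length handling is left out for brevity.

```java
import java.nio.charset.StandardCharsets;

class PathJoinSketch {
  /**
   * Hypothetical helper: joins path components with "/", tolerating a
   * null or empty first component (assumed here to mark an absolute path).
   */
  static String join(byte[][] components) {
    if (components.length == 0) {
      return "";
    }
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < components.length; i++) {
      byte[] c = components[i];
      boolean empty = (c == null || c.length == 0);
      if (i == 0 && empty) {
        // A lone null/empty component is the root itself.
        if (components.length == 1) {
          return "/";
        }
        // Otherwise skip it; the separator is added before the next component.
        continue;
      }
      if (i > 0) {
        sb.append('/');
      }
      if (!empty) {
        sb.append(new String(c, StandardCharsets.UTF_8));
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // The array from the comment: [null, [102, 111, 111]]
    byte[][] path = { null, { 102, 111, 111 } };
    System.out.println(join(path)); // prints "/foo"
  }
}
```

Checking for null before dereferencing each component is what avoids the NPE; whether the leading null should map to the root or to an empty string is a design decision for the patch author.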
[jira] [Commented] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby
[ https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376069#comment-15376069 ] Hadoop QA commented on HDFS-10477: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 21s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 22s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817821/HDFS-10477.004.patch | | JIRA Issue | HDFS-10477 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux b7d6870f248a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d180505 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16052/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16052/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16052/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Stop decommission a rack of DataNodes caused NameNode fail over to standby > -- > > Key: HDFS-10477 >
[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support
[ https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376060#comment-15376060 ] Hadoop QA commented on HDFS-10441: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 42s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 23s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 26s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 42s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | 
{color:green} javac {color} | {color:green} 5m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 47s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 54s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817832/HDFS-10441.HDFS-8707.013.patch | | JIRA Issue | HDFS-10441 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux d22cdda6748f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / d18e396 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16053/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16053/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > libhdfs++: HA namenode support > -- > > Key: HDFS-10441 > URL: https://issues.apache.org/jira/browse/HDFS-10441 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10441.HDFS-8707.000.patch, >
[jira] [Updated] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing
[ https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiayi Zhou updated HDFS-10519: -- Attachment: HDFS-10519.007.patch Add a new parameter isTail to selectInputStream() on the NameNode side and a field in RemoteEditLogManifest. When we do in-progress tailing, we'll use committedTxnId rather than highestWrittenTxnId. This won't affect other parts which also need to select in-progress edits. > Add a configuration option to enable in-progress edit log tailing > - > > Key: HDFS-10519 > URL: https://issues.apache.org/jira/browse/HDFS-10519 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha >Reporter: Jiayi Zhou >Assignee: Jiayi Zhou >Priority: Minor > Attachments: HDFS-10519.001.patch, HDFS-10519.002.patch, > HDFS-10519.003.patch, HDFS-10519.004.patch, HDFS-10519.005.patch, > HDFS-10519.006.patch, HDFS-10519.007.patch > > > The Standby Namenode has the option to do in-progress edit log tailing to improve > data freshness. In-progress tailing is already implemented, but it is not > enabled in the default configuration, and there is no configuration key to > turn it on. > Adding a configuration key that lets the Standby Namenode turn it on is reasonable and > would be a basis for further improvement on the Standby Namenode.
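The choice between the two transaction IDs mentioned in the comment above can be pictured with a small sketch. Only the names isTail, committedTxnId and highestWrittenTxnId come from the comment; the helper maxReadableTxId and the class around it are hypothetical, and this is not the code in HDFS-10519.007.patch.

```java
class TailBoundSketch {
  /**
   * Hypothetical helper: the highest transaction ID a reader is allowed to see.
   * A tailing reader stops at the committed ID; any other reader of
   * in-progress edits keeps using the highest written ID, as before.
   */
  static long maxReadableTxId(boolean isTail, long committedTxnId,
      long highestWrittenTxnId) {
    return isTail ? committedTxnId : highestWrittenTxnId;
  }

  public static void main(String[] args) {
    System.out.println(maxReadableTxId(true, 100L, 105L));  // prints 100
    System.out.println(maxReadableTxId(false, 100L, 105L)); // prints 105
  }
}
```

Keeping the switch in a single parameter is what lets the other callers that select in-progress edits remain unaffected, as the comment notes.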
[jira] [Commented] (HDFS-10601) Improve log message to include hostname when the NameNode is in safemode
[ https://issues.apache.org/jira/browse/HDFS-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376009#comment-15376009 ] Daniel Templeton commented on HDFS-10601: - LGTM. > Improve log message to include hostname when the NameNode is in safemode > > > Key: HDFS-10601 > URL: https://issues.apache.org/jira/browse/HDFS-10601 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Minor > Attachments: HDFS-10601.001.patch, HDFS-10601.002.patch > > > When remote NN operations are involved, it would be nice to have the Namenode > hostname in safemode notification log.
[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376006#comment-15376006 ] Lei (Eddy) Xu commented on HDFS-10598: -- Great. Thanks, Arpit. > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a cluster of 3 DataNodes, each with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN and ran the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times > from {{DiskBalancer#executePlan}}.
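The suspected mechanism in the description above can be modelled in a few lines. This is a simplified sketch, not the DiskBalancer code: the class ExitFlagSketch, its methods, and the step labels are hypothetical, and the "fixed" variant shows only one possible remedy (setting the flag on cancellation alone), which may differ from HDFS-10598.00.patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExitFlagSketch {
  private boolean exitFlag = false;
  final List<String> executed = new ArrayList<>();

  /** Models the suspected bug: the exit flag is set on every return path. */
  void copyBlocksBuggy(String step) {
    if (exitFlag) {
      return;              // all later steps end up here
    }
    executed.add(step);    // stand-in for moving blocks between volumes
    exitFlag = true;       // set even after a normal, successful step
  }

  /** Assumed remedy: only a cancellation sets the flag. */
  void copyBlocksFixed(String step, boolean cancelled) {
    if (exitFlag) {
      return;
    }
    if (cancelled) {
      exitFlag = true;
      return;
    }
    executed.add(step);
  }

  /** Stand-in for executing a plan: one copyBlocks call per step. */
  static List<String> runPlan(List<String> steps, boolean buggy) {
    ExitFlagSketch mover = new ExitFlagSketch();
    for (String step : steps) {
      if (buggy) {
        mover.copyBlocksBuggy(step);
      } else {
        mover.copyBlocksFixed(step, false);
      }
    }
    return mover.executed;
  }

  public static void main(String[] args) {
    List<String> plan = Arrays.asList("data1->data3", "data2->data4");
    System.out.println(runPlan(plan, true));  // prints [data1->data3]
    System.out.println(runPlan(plan, false)); // prints [data1->data3, data2->data4]
  }
}
```

The buggy variant runs only the first step, which is consistent with the disk usage shown above: data1 and data3 changed, while data2 and data4 did not.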
[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support
[ https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10441: --- Attachment: HDFS-10441.HDFS-8707.013.patch Thanks for the review [~xiaowei.zhu]. bq. In the for loop, it should be standby_info_ instead of active_info_. That would have been nasty to debug.. bq. Another small indent problem in the same file Fixed that too. [~bobhansen] Would you mind taking another look at this when you get a chance? > libhdfs++: HA namenode support > -- > > Key: HDFS-10441 > URL: https://issues.apache.org/jira/browse/HDFS-10441 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10441.HDFS-8707.000.patch, > HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, > HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, > HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, > HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, > HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, > HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, > HDFS-8707.HDFS-10441.001.patch > > > If a cluster is HA enabled then do proper failover.
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375952#comment-15375952 ] Hadoop QA commented on HDFS-10544: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2976 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 1m 19s{color} | {color:red} The patch 78 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 7s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_101. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}166m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | JDK v1.7.0_101 Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:c420dfe | | JIRA Patch URL |
[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375945#comment-15375945 ] Hadoop QA commented on HDFS-10620: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}100m 9s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817797/HDFS-10620.001.patch | | JIRA Issue | HDFS-10620 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux c860fc399cf4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / af8f480 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16050/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16050/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16050/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 >
[jira] [Created] (HDFS-10622) o.a.h.security.TestGroupsCaching.testBackgroundRefreshCounters seems flaky
Mingliang Liu created HDFS-10622: Summary: o.a.h.security.TestGroupsCaching.testBackgroundRefreshCounters seems flaky Key: HDFS-10622 URL: https://issues.apache.org/jira/browse/HDFS-10622 Project: Hadoop HDFS Issue Type: Bug Components: security, test Affects Versions: 2.8.0 Reporter: Mingliang Liu h5. Error Message expected:<1> but was:<0> h5. Stacktrace java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.security.TestGroupsCaching.testBackgroundRefreshCounters(TestGroupsCaching.java:638)
[jira] [Updated] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby
[ https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yunjiong zhao updated HDFS-10477: - Attachment: HDFS-10477.004.patch Updated the patch with the following changes: 1. release the lock after finishing processing one storage 2. sleep 1 millisecond before trying to acquire the lock again Thanks [~arpiagariu] and [~kihwal]. > Stop decommission a rack of DataNodes caused NameNode fail over to standby > -- > > Key: HDFS-10477 > URL: https://issues.apache.org/jira/browse/HDFS-10477 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.2 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: HDFS-10477.002.patch, HDFS-10477.003.patch, > HDFS-10477.004.patch, HDFS-10477.patch > > > In our cluster, when we stop decommissioning a rack which has 46 DataNodes, > it locked the Namesystem for about 7 minutes, as the log below shows: > {code} > 2016-05-26 20:11:41,697 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.27:1004 > 2016-05-26 20:11:51,171 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 285258 over-replicated blocks on 10.142.27.27:1004 during recommissioning > 2016-05-26 20:11:51,171 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.118:1004 > 2016-05-26 20:11:59,972 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 279923 over-replicated blocks on 10.142.27.118:1004 during recommissioning > 2016-05-26 20:11:59,972 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.113:1004 > 2016-05-26 20:12:09,007 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 294307 over-replicated blocks on 10.142.27.113:1004 during recommissioning > 2016-05-26 20:12:09,008 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 
10.142.27.117:1004 > 2016-05-26 20:12:18,055 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 314381 over-replicated blocks on 10.142.27.117:1004 during recommissioning > 2016-05-26 20:12:18,056 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.130:1004 > 2016-05-26 20:12:25,938 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 272779 over-replicated blocks on 10.142.27.130:1004 during recommissioning > 2016-05-26 20:12:25,939 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.121:1004 > 2016-05-26 20:12:34,134 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 287248 over-replicated blocks on 10.142.27.121:1004 during recommissioning > 2016-05-26 20:12:34,134 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.33:1004 > 2016-05-26 20:12:43,020 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 299868 over-replicated blocks on 10.142.27.33:1004 during recommissioning > 2016-05-26 20:12:43,020 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.137:1004 > 2016-05-26 20:12:52,220 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 303914 over-replicated blocks on 10.142.27.137:1004 during recommissioning > 2016-05-26 20:12:52,220 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.51:1004 > 2016-05-26 20:13:00,362 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 281175 over-replicated blocks on 10.142.27.51:1004 during recommissioning > 2016-05-26 20:13:00,362 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.12:1004 > 2016-05-26 20:13:08,756 INFO > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 274880 over-replicated blocks on 10.142.27.12:1004 during recommissioning > 2016-05-26 20:13:08,757 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.15:1004 > 2016-05-26 20:13:17,185 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 286334 over-replicated blocks on 10.142.27.15:1004 during recommissioning > 2016-05-26 20:13:17,185 INFO > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop > Decommissioning 10.142.27.14:1004 > 2016-05-26 20:13:25,369 INFO > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated > 280219 over-replicated blocks on 10.142.27.14:1004 during
[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption
[ https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375922#comment-15375922 ] Yongjun Zhang commented on HDFS-10587: -- The block appears to be corrupted at the very beginning of the chunk right after the block transfer (which copies data up to the end of the previous chunk). This looks similar to HDFS-4660. Unfortunately we don't have the exact block file and checksum file on the source and the destination to compare; otherwise, it would be easier to tell what might have happened. > Incorrect offset/length calculation in pipeline recovery causes block > corruption > > > Key: HDFS-10587 > URL: https://issues.apache.org/jira/browse/HDFS-10587 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-10587.001.patch > > > We found that incorrect offset and length calculation in pipeline recovery may > cause block corruption and result in missing blocks under a very unfortunate > scenario. > (1) A client established a pipeline and started writing data to the pipeline. > (2) One of the data nodes in the pipeline restarted, closing the socket, and > some written data were unacknowledged. > (3) Client replaced the failed data node with a new one, initiating block > transfer to copy existing data in the block to the new datanode. > (4) The block is transferred to the new node. Crucially, the entire block, > including the unacknowledged data, was transferred. > (5) The last chunk (512 bytes) was not a full chunk, but the destination > still reserved the whole chunk in its buffer, and wrote the entire buffer to > disk, therefore some written data is garbage. > (6) When the transfer was done, the destination data node converted the > replica from temporary to rbw, which made its visible length equal to the number of > bytes on disk. That is to say, it thought whatever was transferred was > acknowledged.
However, the visible length of the replica is different (round > up to the next multiple of 512) than the source of transfer. [1] > (7) Client then truncated the block in the attempt to remove unacknowledged > data. However, because the visible length is equivalent of the bytes on disk, > it did not truncate unacknowledged data. > (8) When new data was appended to the destination, it skipped the bytes > already on disk. Therefore, whatever was written as garbage was not replaced. > (9) the volume scanner detected corrupt replica, but due to HDFS-10512, it > wouldn’t tell NameNode to mark the replica as corrupt, so the client > continued to form a pipeline using the corrupt replica. > (10) Finally the DN that had the only healthy replica was restarted. NameNode > then update the pipeline to only contain the corrupt replica. > (11) Client continue to write to the corrupt replica, because neither client > nor the data node itself knows the replica is corrupt. When the restarted > datanodes comes back, their replica are stale, despite they are not corrupt. > Therefore, none of the replica is good and up to date. > The sequence of events was reconstructed based on DataNode/NameNode log and > my understanding of code. > Incidentally, we have observed the same sequence of events on two independent > clusters. 
> [1] > The sender has the replica as follows: > 2016-04-15 22:03:05,066 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW > getNumBytes() = 41381376 > getBytesOnDisk() = 41381376 > getVisibleLength()= 41186444 > getVolume() = /hadoop-i/data/current > getBlockFile()= > /hadoop-i/data/current/BP-1043567091-10.216.26.120-1343682168507/current/rbw/blk_1556997324 > bytesAcked=41186444 > bytesOnDisk=41381376 > while the receiver has the replica as follows: > 2016-04-15 22:03:05,068 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW > getNumBytes() = 41186816 > getBytesOnDisk() = 41186816 > getVisibleLength()= 41186816 > getVolume() = /hadoop-g/data/current > getBlockFile()= > /hadoop-g/data/current/BP-1043567091-10.216.26.120-1343682168507/current/rbw/blk_1556997324 > bytesAcked=41186816 > bytesOnDisk=41186816 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
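The lengths quoted in [1] can be checked with a little arithmetic. The following is only a sketch of that check, not code from the patch: the class and method names are made up for illustration, and the 512-byte chunk size is the one mentioned in step (5). Rounding the sender's bytesAcked up to the next multiple of 512 gives exactly the receiver's visible length, which is consistent with the "round up to the next multiple of 512" remark in step (6).

```java
// Illustrative only: ChunkLengthSketch and its methods are hypothetical names,
// not part of the HDFS code base.
final class ChunkLengthSketch {
  // Checksum chunk size mentioned in step (5) of the report.
  static final int CHUNK_SIZE = 512;

  // Round a byte count up to the next multiple of chunkSize.
  static long roundUpToChunk(long length, int chunkSize) {
    return ((length + chunkSize - 1) / chunkSize) * chunkSize;
  }

  // Bytes between the acknowledged length and the end of its chunk,
  // i.e. the region that may hold garbage on the receiver.
  static long paddingBytes(long bytesAcked, int chunkSize) {
    return roundUpToChunk(bytesAcked, chunkSize) - bytesAcked;
  }

  public static void main(String[] args) {
    long senderBytesAcked = 41186444L; // from the sender log in [1]
    System.out.println(roundUpToChunk(senderBytesAcked, CHUNK_SIZE));
    System.out.println(paddingBytes(senderBytesAcked, CHUNK_SIZE));
  }
}
```

With the sender's bytesAcked of 41186444, the rounded value is 41186816, the same as the receiver's getVisibleLength(), so the last 372 bytes of that chunk were never acknowledged by the client.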
[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support
[ https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375865#comment-15375865 ] Xiaowei Zhu commented on HDFS-10441: The latest patch looks almost good to go with one small typo in rpc_engine.cc: {code} bool HANamenodeTracker::IsCurrentStandby_locked(const ::asio::ip::tcp::endpoint ) const { for(unsigned int i=0;ilibhdfs++: HA namenode support > -- > > Key: HDFS-10441 > URL: https://issues.apache.org/jira/browse/HDFS-10441 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10441.HDFS-8707.000.patch, > HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, > HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, > HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, > HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, > HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, > HDFS-10441.HDFS-8707.012.patch, HDFS-8707.HDFS-10441.001.patch > > > If a cluster is HA enabled then do proper failover. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Component/s: ha balancer & mover > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, ha >Affects Versions: 2.6.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
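The change proposed in the description can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the committed fix: the NameService type, its fields, and the example host names are all hypothetical stand-ins, and the real logic lives in DFSUtil and the configured FailoverProxyProvider. The only point shown is the extra condition: return the logical URI only when HA is enabled and the provider uses logical URIs, otherwise fall back to a physical address.

```java
// Illustrative only: none of these types are the real Hadoop classes.
final class NameServiceUriSketch {
  // Hypothetical view of one nameservice's configuration.
  static final class NameService {
    final String id;                // logical nameservice id
    final boolean haEnabled;        // stands in for HAUtil.isHAEnabled(conf, nsId)
    final boolean useLogicalUri;    // stands in for the provider's useLogicalURI
    final String physicalAuthority; // host:port of a NameNode (made-up values)

    NameService(String id, boolean haEnabled, boolean useLogicalUri,
        String physicalAuthority) {
      this.id = id;
      this.haEnabled = haEnabled;
      this.useLogicalUri = useLogicalUri;
      this.physicalAuthority = physicalAuthority;
    }
  }

  static java.util.List<java.net.URI> getNameServiceUris(
      java.util.List<NameService> services) {
    java.util.List<java.net.URI> ret = new java.util.ArrayList<>();
    for (NameService ns : services) {
      // Only hand out the logical URI when the proxy provider can resolve it.
      String authority = (ns.haEnabled && ns.useLogicalUri)
          ? ns.id : ns.physicalAuthority;
      ret.add(java.net.URI.create("hdfs://" + authority));
    }
    return ret;
  }
}
```

In this sketch a nameservice served through an IP-failover style provider (HA enabled, logical URIs disabled) yields its physical address, which is what the description suggests {{Balancer}} needs in order to start.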
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Affects Version/s: 2.6.1 > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, ha >Affects Versions: 2.6.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Target Version/s: 2.6.5 (was: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1) > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover, ha >Affects Versions: 2.6.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-10598: - Affects Version/s: (was: 2.8.0) > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
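The suspected failure mode can be illustrated with a small stand-alone model. This is a sketch under assumptions, not the DiskBalancer code: the class, field, and method bodies below are hypothetical, and it assumes the mover checks its exit flag before doing any work. If every return from copyBlocks sets the exit flag, only the first step of a multi-step plan moves data, which would match the disk usage shown above (data1 to data3 was balanced, data2 to data4 was not).

```java
// Illustrative only: a toy model of the exit-flag behaviour described in the
// report; names and structure are assumptions, not the real DiskBalancer code.
final class ExitFlagSketch {
  private final boolean setExitFlagOnEveryReturn;
  private boolean shouldRun = true;
  private int stepsExecuted = 0;

  ExitFlagSketch(boolean setExitFlagOnEveryReturn) {
    this.setExitFlagOnEveryReturn = setExitFlagOnEveryReturn;
  }

  // Models one step of the plan, e.g. moving data between two volumes.
  void copyBlocks(String step) {
    if (!shouldRun) {
      return; // an earlier call already asked the mover to stop
    }
    stepsExecuted++;
    if (setExitFlagOnEveryReturn) {
      shouldRun = false; // reported behaviour: a normal return also sets the flag
    }
  }

  // Runs every step in order and reports how many actually did work.
  int executePlan(java.util.List<String> steps) {
    for (String step : steps) {
      copyBlocks(step);
    }
    shouldRun = false; // the whole plan is finished, stop the mover
    return stepsExecuted;
  }
}
```

With a two-step plan, the model executes one step when the flag is set on every return and both steps when the flag is only set once the whole plan has run.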
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.6.5 2.7.3 Status: Resolved (was: Patch Available) I just pushed to branch-2.7 and branch-2.6. Thanks again for the review from [~shv]! > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375848#comment-15375848 ] Arpit Agarwal commented on HDFS-10598: -- Disk Balancer is not in branch-2, so I've updated the versions accordingly. > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-10598: - Target Version/s: 3.0.0-beta1 (was: 2.9.0, 3.0.0-beta1) > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375844#comment-15375844 ] Arpit Agarwal commented on HDFS-10598: -- Hi [~eddyxu], thanks for reporting this problem and posting a patch. I believe Anu is out on vacation for the next few weeks. I will review your fix. > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 2.8.0, 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10598: - Assignee: Lei (Eddy) Xu (was: Anu Engineer) Fix Version/s: (was: 2.9.0) Status: Patch Available (was: Open) > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 2.8.0, 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Critical > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
[ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10598: - Attachment: HDFS-10598.00.patch Uploaded a patch that changes {{DiskBalancerMover#copyBlocks}} so that it does not {{setExitFlag}} on normal exit; instead, {{setExitFlag}} is called from {{executePlan()}}. However, it is unclear to me whether {{executePlan()}} needs to call {{setExitFlag()}} at all. [~anu], could you give some input on the cases it was designed for? Thanks. > DiskBalancer does not execute multi-steps plan. > --- > > Key: HDFS-10598 > URL: https://issues.apache.org/jira/browse/HDFS-10598 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: diskbalancer >Affects Versions: 2.8.0, 3.0.0-beta1 >Reporter: Lei (Eddy) Xu >Assignee: Anu Engineer >Priority: Critical > Fix For: 2.9.0 > > Attachments: HDFS-10598.00.patch > > > I set up a 3 DN node cluster, each one with 2 small disks. After creating > some files to fill HDFS, I added two more small disks to one DN. And run the > diskbalancer on this DataNode. > The disk usage before running diskbalancer: > {code} > /dev/loop0 3.9G 2.1G 1.6G 58% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 17M 3.6G 1% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}) > {code} > /dev/loop0 3.9G 1.2G 2.5G 32% /mnt/data1 > /dev/loop1 3.9G 2.6G 1.1G 71% /mnt/data2 > /dev/loop2 3.9G 953M 2.7G 26% /mnt/data3 > /dev/loop3 3.9G 17M 3.6G 1% /mnt/data4 > {code} > It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return does > {{this.setExitFlag}} which prevents {{copyBlocks()}} be called multiple times > from {{DiskBalancer#executePlan}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375793#comment-15375793 ] Zhe Zhang commented on HDFS-10544: -- Reported test failures on branch-2.7 patch are unrelated and pass locally. Committing to branch-2.7 soon. > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375756#comment-15375756 ] Staffan Friberg commented on HDFS-10620: To avoid all allocation.
{noformat}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
index 1a76e09..349b018 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -1319,7 +1319,8 @@ private void addToInvalidates(BlockInfo storedBlock) {
     if (!isPopulatingReplQueues()) {
       return;
     }
-    StringBuilder datanodes = new StringBuilder();
+    StringBuilder datanodes = blockLog.isDebugEnabled()
+        ? new StringBuilder() : null;
     for (DatanodeStorageInfo storage : blocksMap.getStorages(storedBlock)) {
       if (storage.getState() != State.NORMAL) {
         continue;
@@ -1328,10 +1329,12 @@ private void addToInvalidates(BlockInfo storedBlock) {
       final Block b = getBlockOnStorage(storedBlock, storage);
       if (b != null) {
         invalidateBlocks.add(b, node, false);
-        datanodes.append(node).append(" ");
+        if (datanodes != null) {
+          datanodes.append(node).append(" ");
+        }
       }
     }
-    if (datanodes.length() != 0) {
+    if (datanodes != null && datanodes.length() != 0) {
       blockLog.debug("BLOCK* addToInvalidates: {} {}", storedBlock, datanodes);
     }
   }
{noformat}
> StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg > Attachments: HDFS-10620.001.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
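The pattern in the comment above can be tried outside of Hadoop with a minimal stand-in. This is a sketch, not the BlockManager code: DebugLog is a hypothetical logger that only records messages when debug is enabled, and the node names in the usage note are made up. It shows the same idea as the diff: the StringBuilder is allocated only when debug logging is on, and every later use is null-guarded.

```java
// Illustrative only: DebugLog is a hypothetical stand-in for blockLog.
final class LazyLogBuilderSketch {
  static final class DebugLog {
    private final boolean debugEnabled;
    final java.util.List<String> lines = new java.util.ArrayList<>();

    DebugLog(boolean debugEnabled) {
      this.debugEnabled = debugEnabled;
    }

    boolean isDebugEnabled() {
      return debugEnabled;
    }

    void debug(String message) {
      if (debugEnabled) {
        lines.add(message);
      }
    }
  }

  // Processes every node; builds the log string only if debug is enabled.
  static int invalidate(java.util.List<String> nodes, DebugLog log) {
    StringBuilder datanodes = log.isDebugEnabled() ? new StringBuilder() : null;
    int invalidated = 0;
    for (String node : nodes) {
      invalidated++; // the real work happens regardless of logging
      if (datanodes != null) {
        datanodes.append(node).append(' ');
      }
    }
    if (datanodes != null && datanodes.length() != 0) {
      log.debug("BLOCK* addToInvalidates: " + datanodes);
    }
    return invalidated;
  }
}
```

Calling invalidate with two nodes and debug disabled returns 2 and records nothing; with debug enabled it records a single line listing both nodes. The trade-off raised in the issue still applies: the null checks save one allocation per call at the cost of slightly noisier code.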
[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-10620: --- Fix Version/s: (was: 3.0.0-alpha1) > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg > Attachments: HDFS-10620.001.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-10620: --- Attachment: HDFS-10620.001.patch > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg > Fix For: 3.0.0-alpha1 > > Attachments: HDFS-10620.001.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled
[ https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-10620: --- Fix Version/s: 3.0.0-alpha1 Status: Patch Available (was: Open) > StringBuilder created and appended even if logging is disabled > -- > > Key: HDFS-10620 > URL: https://issues.apache.org/jira/browse/HDFS-10620 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.4 >Reporter: Staffan Friberg > Fix For: 3.0.0-alpha1 > > Attachments: HDFS-10620.001.patch > > > In BlockManager.addToInvalidates the StringBuilder is appended to during the > delete even if logging isn't active. > Could avoid allocating the StringBuilder as well, but not sure if it is > really worth it to add null handling in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10621) libhdfs++: Implement . (dot) and .. (double-dot) semantics
Anatoli Shein created HDFS-10621: Summary: libhdfs++: Implement . (dot) and .. (double-dot) semantics Key: HDFS-10621 URL: https://issues.apache.org/jira/browse/HDFS-10621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Anatoli Shein We need to implement . (dot) and .. (double-dot) semantics in hdfs.cc in getAbsolutePath, hdfsSetWorkingDirectory, hdfsGetWorkingDirectory. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10620) StringBuilder created and appended even if logging is disabled
Staffan Friberg created HDFS-10620: -- Summary: StringBuilder created and appended even if logging is disabled Key: HDFS-10620 URL: https://issues.apache.org/jira/browse/HDFS-10620 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.4 Reporter: Staffan Friberg In BlockManager.addToInvalidates the StringBuilder is appended to during the delete even if logging isn't active. Could avoid allocating the StringBuilder as well, but not sure if it is really worth it to add null handling in the code.
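The idea in the description can be sketched roughly as follows. This is not the attached patch: it uses java.util.logging as a stand-in for the NameNode's block log, and the class name, method signature and log message are assumptions made for illustration. It shows the more aggressive variant mentioned above, where the StringBuilder is not even allocated when logging is off, at the cost of null checks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class InvalidateLogSketch {
  // Stand-in logger; the real code logs through the NameNode's own block log.
  static final Logger LOG = Logger.getLogger("InvalidateLogSketch");

  /**
   * Hypothetical shape of addToInvalidates. Returns the message that was
   * logged, or null when logging is disabled and nothing was built.
   */
  public static String addToInvalidates(String block, List<String> datanodes) {
    // Allocate the builder only if the message will actually be emitted.
    StringBuilder names = LOG.isLoggable(Level.FINE) ? new StringBuilder() : null;
    for (String dn : datanodes) {
      // (the real method would queue the block for deletion on dn here)
      if (names != null) {
        names.append(dn).append(' ');
      }
    }
    if (names == null) {
      return null;
    }
    String msg = "addToInvalidates: " + block + " " + names;
    LOG.fine(msg);
    return msg;
  }

  public static void main(String[] args) {
    LOG.setLevel(Level.FINE);
    System.out.println(addToInvalidates("blk_1", Arrays.asList("dn1", "dn2")));
  }
}
```

The cheaper alternative hinted at in the description would keep the unconditional allocation and guard only the append calls, which avoids the null handling entirely.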
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375723#comment-15375723 ] Hadoop QA commented on HDFS-10544: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 28s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 12s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2976 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 1m 17s{color} | {color:red} The patch 78 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 44m 52s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_101. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}133m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark | | JDK v1.7.0_101 Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.tools.TestJMXGet | | | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:c420dfe | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817775/HDFS-10544-branch-2.7.patch | | JIRA Issue |
[jira] [Commented] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing
[ https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375635#comment-15375635 ] Hadoop QA commented on HDFS-10519: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 61m 8s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 81m 23s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817776/HDFS-10519.006.patch | | JIRA Issue | HDFS-10519 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux 73c8744b0f21 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / eb47163 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16048/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16048/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add a configuration option to enable in-progress edit log tailing > - > > Key: HDFS-10519 > URL: https://issues.apache.org/jira/browse/HDFS-10519 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha >Reporter: Jiayi Zhou >Assignee: Jiayi Zhou >Priority: Minor > Attachments: HDFS-10519.001.patch,
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375617#comment-15375617 ] Hadoop QA commented on HDFS-10301: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the 
patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 368 unchanged - 12 fixed = 370 total (was 380) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 58s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 82m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs | | | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.TestFileChecksum | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817774/HDFS-10301.008.patch | | JIRA Issue | HDFS-10301 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f991214b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / eb47163 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/16045/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/16045/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16045/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16045/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > BlockReport retransmissions may lead to storages falsely being
[jira] [Commented] (HDFS-10619) Cache path in InodesInPath
[ https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375596#comment-15375596 ] Hadoop QA commented on HDFS-10619: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 36s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 94m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestSnapshotCommands | | | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.server.namenode.TestGetBlockLocations | | | hadoop.hdfs.TestHDFSFileSystemContract | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs | | | hadoop.hdfs.TestReservedRawPaths | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.TestDatanodeLayoutUpgrade | | | hadoop.hdfs.server.mover.TestStorageMover | | | hadoop.hdfs.TestClientReportBadBlock | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | | hadoop.hdfs.TestDFSShell | | | hadoop.hdfs.server.namenode.ha.TestHAFsck | | | hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot | | | hadoop.cli.TestCryptoAdminCLI | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.TestErasureCodingPolicies | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.fs.viewfs.TestViewFsAtHdfsRoot | | | hadoop.fs.permission.TestStickyBit | | | hadoop.hdfs.TestEncryptionZonesWithKMS | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | | hadoop.fs.TestGlobPaths | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817765/HDFS-10619.patch | | JIRA Issue | HDFS-10619 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite
[jira] [Commented] (HDFS-10619) Cache path in InodesInPath
[ https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375559#comment-15375559 ] Zhe Zhang commented on HDFS-10619: -- This looks like a good fix. Thanks Daryn. If an iip is created but {{getPath}} is never called, we are increasing the memory usage by one {{String}}. But I think this is pretty rare, so overall the fix is an improvement. Thoughts from others? I'll hold off on a +1 till the end of today (because of the above tradeoff). > Cache path in InodesInPath > -- > > Key: HDFS-10619 > URL: https://issues.apache.org/jira/browse/HDFS-10619 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-10619.patch > > > INodesInPath#getPath, a frequently called method, dynamically builds the > path. IIP should cache the path upon construction.
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375518#comment-15375518 ] Vinitha Reddy Gankidi commented on HDFS-10301: -- I apologize for attaching the wrong patch. Thanks for pointing it out [~cmccabe]. I have now uploaded the correct patch (008), which calls the isStorageReport method. Adding an optional list of storage ID strings in the .proto file would add more overhead since these optional parameters would have to be sent with default values for all other block report RPCs in addition to the last RPC of the block report. I can add more comments in the code to explain what's going on. Thoughts? > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Vinitha Reddy Gankidi >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, > HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.01.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When NameNode is busy a DataNode can time out sending a block report. Then it > sends the block report again. Then NameNode, while processing these two reports > at the same time, can interleave processing storages from different reports. > This screws up the blockReportId field, which makes NameNode think that some > storages are zombie. Replicas from zombie storages are immediately removed, > causing missing blocks.
[jira] [Updated] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing
[ https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiayi Zhou updated HDFS-10519: -- Attachment: HDFS-10519.006.patch Also add a boolean flag in Journal for the same purpose. > Add a configuration option to enable in-progress edit log tailing > - > > Key: HDFS-10519 > URL: https://issues.apache.org/jira/browse/HDFS-10519 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha >Reporter: Jiayi Zhou >Assignee: Jiayi Zhou >Priority: Minor > Attachments: HDFS-10519.001.patch, HDFS-10519.002.patch, > HDFS-10519.003.patch, HDFS-10519.004.patch, HDFS-10519.005.patch, > HDFS-10519.006.patch > > > Standby Namenode has the option to do in-progress edit log tailing to improve > the data freshness. In-progress tailing is already implemented, but it's not > enabled in the default configuration, and there's no related configuration key to > turn it on. > Adding a related configuration key to let Standby Namenode turn it on is reasonable and > would be a basis for further improvement on Standby Namenode.
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Attachment: HDFS-10544-branch-2.7.patch Attaching branch-2.7 patch to trigger Jenkins. > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, > HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, > HDFS-10544.04.patch, HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > The {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId.
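The proposed change to the {{if}} clause can be modelled roughly as below. This is a self-contained toy, not the committed patch and not the real DFSUtil code: NsConfig, its fields and the example host names are all hypothetical stand-ins for what the real method would read from the Hadoop Configuration and the configured failover proxy provider.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NameServiceUriSketch {
  /** Hypothetical per-nameservice settings (stand-in for Configuration lookups). */
  public static final class NsConfig {
    final boolean haEnabled;
    final boolean useLogicalUri;   // whether the proxy provider wants logical URIs
    final String physicalAddress;  // host:port to fall back to

    public NsConfig(boolean haEnabled, boolean useLogicalUri, String physicalAddress) {
      this.haEnabled = haEnabled;
      this.useLogicalUri = useLogicalUri;
      this.physicalAddress = physicalAddress;
    }
  }

  /** Returns one URI per nameservice: logical only when HA AND useLogicalUri hold. */
  public static List<URI> getNameServiceUris(Map<String, NsConfig> nameservices) {
    List<URI> ret = new ArrayList<>();
    for (Map.Entry<String, NsConfig> e : nameservices.entrySet()) {
      NsConfig ns = e.getValue();
      if (ns.haEnabled && ns.useLogicalUri) {
        // Logical URI of the nameservice, as in the snippet quoted above.
        ret.add(URI.create("hdfs://" + e.getKey()));
      } else {
        // Otherwise resolve to a physical address (e.g. a virtual IP).
        ret.add(URI.create("hdfs://" + ns.physicalAddress));
      }
    }
    return ret;
  }

  public static void main(String[] args) {
    Map<String, NsConfig> conf = new LinkedHashMap<>();
    conf.put("ns1", new NsConfig(true, true, "nn1.example.com:8020"));
    conf.put("ns2", new NsConfig(true, false, "vip.example.com:8020"));
    System.out.println(getNameServiceUris(conf));
  }
}
```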
[jira] [Updated] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinitha Reddy Gankidi updated HDFS-10301: - Attachment: HDFS-10301.008.patch > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Vinitha Reddy Gankidi >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, > HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.01.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When NameNode is busy a DataNode can time out sending a block report. Then it > sends the block report again. Then NameNode, while processing these two reports > at the same time, can interleave processing storages from different reports. > This screws up the blockReportId field, which makes NameNode think that some > storages are zombie. Replicas from zombie storages are immediately removed, > causing missing blocks.
[jira] [Updated] (HDFS-10619) Cache path in InodesInPath
[ https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-10619: --- Attachment: HDFS-10619.patch > Cache path in InodesInPath > -- > > Key: HDFS-10619 > URL: https://issues.apache.org/jira/browse/HDFS-10619 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-10619.patch > > > INodesInPath#getPath, a frequently called method, dynamically builds the > path. IIP should cache the path upon construction.
[jira] [Updated] (HDFS-10619) Cache path in InodesInPath
[ https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-10619: --- Status: Patch Available (was: Open) > Cache path in InodesInPath > -- > > Key: HDFS-10619 > URL: https://issues.apache.org/jira/browse/HDFS-10619 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HDFS-10619.patch > > > INodesInPath#getPath, a frequently called method, dynamically builds the > path. IIP should cache the path upon construction.
[jira] [Created] (HDFS-10619) Cache path in InodesInPath
Daryn Sharp created HDFS-10619: -- Summary: Cache path in InodesInPath Key: HDFS-10619 URL: https://issues.apache.org/jira/browse/HDFS-10619 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs Reporter: Daryn Sharp Assignee: Daryn Sharp INodesInPath#getPath, a frequently called method, dynamically builds the path. IIP should cache the path upon construction.
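As a rough illustration of "cache the path upon construction", the sketch below builds the path string once in the constructor so that getPath becomes a field read. It is not the attached patch: CachedPathSketch is a hypothetical stand-in for INodesInPath, and it assumes path components are plain strings with an empty first component standing for the root.

```java
public class CachedPathSketch {
  private final String[] components;
  // Built once at construction; getPath() no longer rebuilds it per call.
  private final String path;

  public CachedPathSketch(String[] components) {
    this.components = components.clone();
    this.path = buildPath(this.components);
  }

  /** Joins components with '/', treating a lone empty component as the root. */
  static String buildPath(String[] components) {
    if (components.length == 1 && components[0].isEmpty()) {
      return "/";
    }
    return String.join("/", components);
  }

  public String getPath() {
    return path;
  }

  public int length() {
    return components.length;
  }

  public static void main(String[] args) {
    CachedPathSketch iip = new CachedPathSketch(new String[] {"", "user", "alice"});
    System.out.println(iip.getPath());
  }
}
```

The cost of eager caching is one extra String per instance even when getPath is never called; a lazily initialized field would avoid that at the price of a null check on each call.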
[jira] [Commented] (HDFS-10616) Improve performance of path handling
[ https://issues.apache.org/jira/browse/HDFS-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375423#comment-15375423 ] Daryn Sharp commented on HDFS-10616: Will be an umbrella for sub-tasks to incrementally integrate large internal patches. In combination with other internal changes (forthcoming IPC optimizations, other object allocation reductions), heap growth has dramatically slowed. > Improve performance of path handling > > > Key: HDFS-10616 > URL: https://issues.apache.org/jira/browse/HDFS-10616 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.0.0-alpha >Reporter: Daryn Sharp >Assignee: Daryn Sharp > > Path handling in the namesystem and directory is very inefficient. The path > is repeatedly resolved, decomposed into path components, recombined to a full > path, and parsed again throughout the system. This is directly inefficient for > general performance, and indirectly via unnecessary pressure on young gen GC. > The namesystem should only operate on paths, parse them once into inodes, and > the directory should only operate on inodes.
[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support
[ https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375380#comment-15375380 ] Hadoop QA commented on HDFS-10441: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 12s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 31s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 22s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 14s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 37s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | 
{color:green} javac {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 37s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 10s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0cf5e66 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817724/HDFS-10441.HDFS-8707.012.patch | | JIRA Issue | HDFS-10441 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 25e8ac932b81 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / d18e396 | | Default Java | 1.7.0_101 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 | | JDK v1.7.0_101 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16043/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16043/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > libhdfs++: HA namenode support > -- > > Key: HDFS-10441 > URL: https://issues.apache.org/jira/browse/HDFS-10441 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10441.HDFS-8707.000.patch, >
[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375352#comment-15375352 ] Hadoop QA commented on HDFS-10617: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m 0s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 81m 13s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817696/HDFS-10617.002.patch | | JIRA Issue | HDFS-10617 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux af8fbc3ab72c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d6d41e8 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16041/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16041/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > >
[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support
[ https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10441: --- Attachment: HDFS-10441.HDFS-8707.012.patch Just rebasing the patch since HDFS-9890 was committed to HDFS-8707. > libhdfs++: HA namenode support > -- > > Key: HDFS-10441 > URL: https://issues.apache.org/jira/browse/HDFS-10441 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10441.HDFS-8707.000.patch, > HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, > HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, > HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, > HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, > HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, > HDFS-10441.HDFS-8707.012.patch, HDFS-8707.HDFS-10441.001.patch > > > If a cluster is HA enabled then do proper failover. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
[ https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375291#comment-15375291 ] Hadoop QA commented on HDFS-10618: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-10618 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817707/HDFS-10618-b2.001.patch | | JIRA Issue | HDFS-10618 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16042/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > TestPendingReconstruction#testPendingAndInvalidate is flaky due to race > condition > - > > Key: HDFS-10618 > URL: https://issues.apache.org/jira/browse/HDFS-10618 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch > > > TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
[ https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10618: --- Status: Patch Available (was: Open) > TestPendingReconstruction#testPendingAndInvalidate is flaky due to race > condition > - > > Key: HDFS-10618 > URL: https://issues.apache.org/jira/browse/HDFS-10618 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch > > > TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
[ https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10618: --- Attachment: HDFS-10618-b2.001.patch Attaching a branch-2 patch, since reconstruction is known as replication in branch-2 and below > TestPendingReconstruction#testPendingAndInvalidate is flaky due to race > condition > - > > Key: HDFS-10618 > URL: https://issues.apache.org/jira/browse/HDFS-10618 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch > > > TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
[ https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10618: --- Attachment: HDFS-10618.001.patch Attaching a patch that fixes the race in the test by putting all of the pertinent test functionality inside of the write lock. This will prevent the Replication Monitor from running while the test is corrupting and placing blocks in their respective reconstruction structures. > TestPendingReconstruction#testPendingAndInvalidate is flaky due to race > condition > - > > Key: HDFS-10618 > URL: https://issues.apache.org/jira/browse/HDFS-10618 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10618.001.patch > > > TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
[ https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375264#comment-15375264 ] Eric Badger commented on HDFS-10618: Inside of the Replication Monitor, BlockManager.computeReconstructionWorkForBlocks() removes blocks from neededReconstruction, then computes locations for those blocks to be replicated, and then places them into pendingReconstruction. However, before computing the locations the write lock is released (and reacquired to add to pendingReconstruction). testPendingAndInvalidate can expose this race condition because it also indirectly calls BlockManager.computeReconstructionWorkForBlocks. The following scenario outlines how this test can fail: 1. ReplicationMonitor calls computeReconstructionWorkForBlocks, removes blocks from neededReconstruction, releases the write lock, and takes time computing the locations for replication. 2. testPendingAndInvalidate calls computeReconstructionWorkForBlocks, sees nothing in neededReconstruction, spends 0 time computing locations, adds nothing to pendingReconstruction, and returns. 3. testPendingAndInvalidate calls updateState() and indirectly sets pendingReconstructionBlocksCount to the current size of pendingReconstruction (which is 0, since the Replication Monitor is still computing the block locations and hasn't yet added the blocks to pendingReconstruction). 4. testPendingAndInvalidate checks the value of pendingReconstructionBlocksCount via getPendingReconstructionBlocksCount() and sees that it is 0, causing the associated assert to fail. It is unclear to me whether or not this failure can happen outside of this test, since the test explicitly calls computeReconstructionWorkForBlocks, which is normally only called by the Replication Monitor. 
> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race > condition > - > > Key: HDFS-10618 > URL: https://issues.apache.org/jira/browse/HDFS-10618 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Eric Badger >Assignee: Eric Badger > > TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
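The failing interleaving described in the comment above can be replayed deterministically on a single thread. This is only an illustrative model: `ReconstructionModel`, `RaceWindowDemo`, and their methods are hypothetical names, and the two collections merely stand in for neededReconstruction and pendingReconstruction. No real locks are taken; the comments mark where the write lock would be held or dropped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of the two structures; not BlockManager itself. */
final class ReconstructionModel {
  final Queue<String> needed = new ArrayDeque<>();
  final List<String> pending = new ArrayList<>();

  /** Phase 1 (lock held): take all work out of the needed queue. */
  List<String> takeNeeded() {
    List<String> work = new ArrayList<>(needed);
    needed.clear();
    return work;
  }

  /** Phase 2 (lock reacquired): publish the work as pending. */
  void addPending(List<String> work) {
    pending.addAll(work);
  }
}

final class RaceWindowDemo {
  /** Replays the failing order and returns the count the test observes. */
  static int pendingSeenByTest() {
    ReconstructionModel m = new ReconstructionModel();
    m.needed.add("blk_1");

    // 1. Monitor takes the block, then drops the lock to compute locations.
    List<String> monitorWork = m.takeNeeded();

    // 2. Test runs the same routine, finds nothing, publishes nothing.
    m.addPending(m.takeNeeded());

    // 3. Test samples the pending count while the monitor is still busy.
    int seen = m.pending.size();

    // 4. Only now does the monitor publish its work.
    m.addPending(monitorWork);
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("pending seen by test: " + pendingSeenByTest());
  }
}
```

In this model the block is in neither structure between steps 1 and 4, which is the window the test falls into. Holding the write lock across the test's own steps, as the attached patch describes, keeps the monitor from entering phase 1 in the middle of them.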
[jira] [Created] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition
Eric Badger created HDFS-10618: -- Summary: TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition Key: HDFS-10618 URL: https://issues.apache.org/jira/browse/HDFS-10618 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Eric Badger Assignee: Eric Badger TestPendingReconstruction#testPendingAndInvalidate fails intermittently. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues
[ https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-9890: -- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the work [~xiaowei.zhu], I just committed this to HDFS-8707. > libhdfs++: Add test suite to simulate network issues > > > Key: HDFS-9890 > URL: https://issues.apache.org/jira/browse/HDFS-9890 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: Xiaowei Zhu > Attachments: HDFS-9890.HDFS-8707.000.patch, > HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, > HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch, > HDFS-9890.HDFS-8707.005.patch, HDFS-9890.HDFS-8707.006.patch, > HDFS-9890.HDFS-8707.007.patch, HDFS-9890.HDFS-8707.008.patch, > HDFS-9890.HDFS-8707.009.patch, HDFS-9890.HDFS-8707.010.patch, > HDFS-9890.HDFS-8707.011.patch, HDFS-9890.HDFS-8707.012.patch, > HDFS-9890.HDFS-8707.012.patch, HDFS-9890.HDFS-8707.013.patch, > HDFS-9890.HDFS-8707.013.patch, HDFS-9890.HDFS-8707.014.patch, > HDFS-9890.HDFS-8707.015.patch, HDFS-9890.HDFS-8707.016.patch, > HDFS-9890.HDFS-8707.016.patch, hs_err_pid26832.log, hs_err_pid4944.log > > > I propose adding a test suite to simulate various network issues/failures in > order to get good test coverage on some of the retry paths that aren't easy > to hit in mock unit tests. > At the moment the only things that hit the retry paths are the gmock unit > tests. The gmock are only as good as their mock implementations which do a > great job of simulating protocol correctness but not more complex > interactions. They also can't really simulate the types of lock contention > and subtle memory stomps that show up while doing hundreds or thousands of > concurrent reads. We should add a new minidfscluster test that focuses on > heavy read/seek load and then randomly convert error codes returned by > network functions into errors. 
> List of things to simulate(while heavily loaded), roughly in order of how > badly I think they need to be tested at the moment: > -Rpc connection disconnect > -Rpc connection slowed down enough to cause a timeout and trigger retry > -DN connection disconnect -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
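The proposal to randomly convert the results of network functions into errors can be sketched as a small wrapper. libhdfs++ itself is C++; this Java sketch only illustrates the technique. `FaultInjector`, its constructor parameters, and both methods are hypothetical and do not correspond to anything in the HDFS-9890 patch.

```java
import java.io.IOException;
import java.util.Random;
import java.util.concurrent.Callable;

/** Hypothetical fault injector; not part of libhdfs++ or its tests. */
final class FaultInjector {
  private final Random random;
  private final double failureRate;

  /** A fixed seed keeps a failing run reproducible. */
  FaultInjector(long seed, double failureRate) {
    this.random = new Random(seed);
    this.failureRate = failureRate;
  }

  /** Runs the call, but fails first with probability failureRate. */
  <T> T call(Callable<T> op) throws Exception {
    if (random.nextDouble() < failureRate) {
      throw new IOException("injected network failure");
    }
    return op.call();
  }

  /** Retries through IOExceptions (injected or real), up to maxAttempts. */
  <T> T callWithRetry(Callable<T> op, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return call(op);
      } catch (IOException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last;
  }
}
```

Seeding the random source is a design choice made here so that a retry-path failure found under load can be replayed; the issue text does not say how the real test suite picks which calls to fail.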
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Status: Patch Available (was: Open) > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Attachment: HDFS-10617.002.patch Attaching updated trunk patch. Eclipse inserted tabs instead of spaces into the first one. > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, > HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Attachment: HDSF-10617-b2.001.patch Attaching branch-2 patch > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch, HDSF-10617-b2.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Attachment: HDFS-10617.001.patch Attaching patch for trunk > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HDFS-10617.001.patch > > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Description: PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) is a HashMap, which is not a thread-safe data structure. Therefore, the size() function should be synchronized just like the rest of the member functions. (was: pendingReconstructions (pendingReplicationBlocks in branch-2 and below) is a HashMap, which is not a thread-safe data structure. Therefore, the size() function should be synchronized just like the rest of the member functions. ) Summary: PendingReconstructionBlocks.size() should be synchronized (was: PendingReconstructions.size() should be synchronized) > PendingReconstructionBlocks.size() should be synchronized > - > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > > PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) > is a HashMap, which is not a thread-safe data structure. Therefore, the > size() function should be synchronized just like the rest of the member > functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10425) Clean up NNStorage and TestSaveNamespace
[ https://issues.apache.org/jira/browse/HDFS-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Bokor updated HDFS-10425: Attachment: HDFS-10425.02.patch Patch 02. The first one was no longer applicable. > Clean up NNStorage and TestSaveNamespace > > > Key: HDFS-10425 > URL: https://issues.apache.org/jira/browse/HDFS-10425 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Trivial > Attachments: HDFS-10425.01.patch, HDFS-10425.02.patch > > > Since I was working with the NNStorage and TestSaveNamespace classes, it is a good > time to take care of their IDE and checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10617) PendingReconstructions.size() should be synchronized
[ https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HDFS-10617: --- Description: pendingReconstructions (pendingReplicationBlocks in branch-2 and below) is a HashMap, which is not a thread-safe data structure. Therefore, the size() function should be synchronized just like the rest of the member functions. (was: pendingReplications is a HashMap, which is not a thread-safe data structure. Therefore, the size() function should be synchronized just like the rest of the member functions. ) Summary: PendingReconstructions.size() should be synchronized (was: PendingReplicationBlocks.size() should be synchronized) > PendingReconstructions.size() should be synchronized > > > Key: HDFS-10617 > URL: https://issues.apache.org/jira/browse/HDFS-10617 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.22.0 >Reporter: Eric Badger >Assignee: Eric Badger > > pendingReconstructions (pendingReplicationBlocks in branch-2 and below) is a > HashMap, which is not a thread-safe data structure. Therefore, the size() > function should be synchronized just like the rest of the member functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10617) PendingReplicationBlocks.size() should be synchronized
Eric Badger created HDFS-10617: -- Summary: PendingReplicationBlocks.size() should be synchronized Key: HDFS-10617 URL: https://issues.apache.org/jira/browse/HDFS-10617 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Eric Badger Assignee: Eric Badger pendingReplications is a HashMap, which is not a thread-safe data structure. Therefore, the size() function should be synchronized just like the rest of the member functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
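A minimal sketch of the point made in this issue: when every other accessor of a plain HashMap is synchronized, `size()` should take the same monitor. `PendingBlocks` is a hypothetical, simplified class, not the real PendingReconstructionBlocks, and the choice to lock on the map object is an assumption the thread does not confirm.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for PendingReconstructionBlocks. */
final class PendingBlocks {
  private final Map<String, Integer> pending = new HashMap<>();

  /** Records one more pending replica for the block. */
  void increment(String block) {
    synchronized (pending) {
      pending.merge(block, 1, Integer::sum);
    }
  }

  void remove(String block) {
    synchronized (pending) {
      pending.remove(block);
    }
  }

  /** Takes the same monitor as the writers, so it never reads mid-update. */
  int size() {
    synchronized (pending) {
      return pending.size();
    }
  }
}
```

Without the lock in `size()`, a reader could observe the map while another thread is resizing it, and the Java memory model gives no guarantee the reader sees a recent value at all.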
[jira] [Comment Edited] (HDFS-9809) Abstract implementation-specific details from the datanode
[ https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111 ] Ewan Higgs edited comment on HDFS-9809 at 7/13/16 2:33 PM: --- Hi, In Storage.java, I think a good deal of copyFileBuffered can be replaced with {{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {{fis}} could also be renamed {{fin}} to reflect the opposite of {{fout}}. Across a lot of these files, loggers are using '\+' for string concatenation rather than using slf4j templating (\{\}). There is an ongoing effort (HADOOP-9864, HDFS-8971, etc) to get these fixed up so instead of adding more LOG statements with '\+', try to take the opportunity to clean it up as you go (though this just adds to this already rather large patch). was (Author: ehiggs): Hi, In Storage.java, I think a good deal of copyFileBuffered can be replaced with {{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {fis}] could also be renamed {{fin}} to reflect the opposite of {{fout}}. Across a lot of these files, loggers are using '\+' for string concatenation rather than using slf4j templating (\{\}). There is an ongoing effort (HADOOP-9864, HDFS-8971, etc) to get these fixed up so instead of adding more LOG statements with '\+', try to take the opportunity to clean it up as you go (though this just adds to this already rather large patch). > Abstract implementation-specific details from the datanode > -- > > Key: HDFS-9809 > URL: https://issues.apache.org/jira/browse/HDFS-9809 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode, fs >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, > HDFS-9809.003.patch, HDFS-9809.004.patch > > > Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) > implicitly assume that blocks are stored in java.io.File(s) and that volumes > are divided into directories. 
We propose to abstract these details, which > would help in supporting other storages. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9809) Abstract implementation-specific details from the datanode
[ https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111 ] Ewan Higgs commented on HDFS-9809: -- Hi, In Storage.java, I think a good deal of copyFileBuffered can be replaced with {{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {{fis}} could also be renamed {{fin}} to reflect the opposite of {{fout}}. Across a lot of these files, loggers are using '\+' for string concatenation rather than using slf4j templating (\{\}). There is an ongoing effort (HADOOP-9864, HDFS-8971, etc) to get these fixed up, so instead of adding more LOG statements with '\+', try to take the opportunity to clean it up as you go (though this just adds to this already rather large patch). > Abstract implementation-specific details from the datanode > -- > > Key: HDFS-9809 > URL: https://issues.apache.org/jira/browse/HDFS-9809 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode, fs >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, > HDFS-9809.003.patch, HDFS-9809.004.patch > > > Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) > implicitly assume that blocks are stored in java.io.File(s) and that volumes > are divided into directories. We propose to abstract these details, which > would help in supporting other storages.
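The copy loop that the comment above suggests delegating to a library helper can be sketched with the JDK alone. This is only an illustration of the kind of buffered loop that a helper such as commons-io {{IOUtils.copy}} stands in for; the class name {{StreamCopySketch}}, the 8 KB buffer size, and the sample payload are assumptions and are not taken from the HDFS-9809 patch or from Storage.java.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hand-written buffered copy, named fin/fout as the comment suggests. */
public class StreamCopySketch {

  /** Copies everything from fin to fout and returns the number of bytes copied. */
  public static long copy(InputStream fin, OutputStream fout) throws IOException {
    byte[] buf = new byte[8192]; // buffer size is an arbitrary choice for this sketch
    long total = 0;
    int n;
    while ((n = fin.read(buf)) != -1) {
      fout.write(buf, 0, n);
      total += n;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream fout = new ByteArrayOutputStream();
    long copied = copy(new ByteArrayInputStream(data), fout);
    // With slf4j this message would be written as a template instead of
    // string concatenation, e.g. LOG.info("Copied {} bytes", copied);
    System.out.println("Copied " + copied + " bytes");
  }
}
```

Replacing such a loop with a library call mostly removes code to review; the behaviour stays the same as long as the caller still closes both streams.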
[jira] [Comment Edited] (HDFS-9809) Abstract implementation-specific details from the datanode
[ https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111 ] Ewan Higgs edited comment on HDFS-9809 at 7/13/16 2:32 PM: --- Hi, In Storage.java, I think a good deal of copyFileBuffered can be replaced with {{org.apache.commons.IOUtils.copy}} (or {{copyLarge}}). {fis}] could also be renamed {{fin}} to reflect the opposite of {{fout}}. Across a lot of these files, loggers are using '\+' for string concatenation rather than using sl4j templating (\{\}). There is an ongoing effort (HADOOP-9864, HDFS-8971, etc) to get these fixed up so instead of adding more LOG statements with '\+', try to take the opportunity to clean it up as you go (though this just adds to this already rather large patch). was (Author: ehiggs): Hi, In Storage.java, I think a good deal of copyFileBuffered can be replaced with {{org.apache.commons.IOUtils.copy}} (or {{copyLarge}}). {fis} could also be renamed {{fin]} to reflect the opposite of {{fout}}. Across a lot of these files, loggers are using '\+' for string concatenation rather than using sl4j templating (\{\}). There is an ongoing effort (HADOOP-9864, HDFS-8971, etc) to get these fixed up so instead of adding more LOG statements with '\+', try to take the opportunity to clean it up as you go (though this just adds to this already rather large patch). > Abstract implementation-specific details from the datanode > -- > > Key: HDFS-9809 > URL: https://issues.apache.org/jira/browse/HDFS-9809 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode, fs >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, > HDFS-9809.003.patch, HDFS-9809.004.patch > > > Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) > implicitly assume that blocks are stored in java.io.File(s) and that volumes > are divided into directories. 
We propose to abstract these details, which > would help in supporting other storages.
[jira] [Commented] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured
[ https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374540#comment-15374540 ] Hadoop QA commented on HDFS-10606: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 11s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 24s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 14s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} branch-2.7 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} branch-2.7 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 40s{color} | {color:red} hadoop-common-project/hadoop-common in branch-2.7 has 3 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 22s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 15s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 15s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 8 new + 129 unchanged - 0 fixed = 137 total (was 129) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2545 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 1m 14s{color} | {color:red} The patch 70 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 34s{color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_101. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 27s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 86m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.http.TestSSLHttpServer | | JDK v1.8.0_91 Timed out junit tests | org.apache.hadoop.conf.TestConfiguration | | JDK v1.7.0_101 Failed junit tests | hadoop.ha.TestZKFailoverController | | JDK v1.7.0_101 Timed out junit tests |
[jira] [Commented] (HDFS-3051) A zero-copy ScatterGatherRead api from FSDataInputStream
[ https://issues.apache.org/jira/browse/HDFS-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374535#comment-15374535 ] Ravikumar commented on HDFS-3051: - How about returning the MappedByteBuffers of all blocks of a file when every block is local? If there are non-local blocks, this method can simply return an empty list.

  public List readFullyScatterGatherLocal(EnumSet options) throws IOException {
    return ((PositionedReadable)in).readFullyScatterGather(options);
  }

A quick sample-impl can be like

  public List readFullyScatterGatherLocal(EnumSet options) throws IOException {
    List<LocatedBlock> blockRange = getBlockRange(0, getFileLength());
    if (!allBlocksInLocal(blockRange)) {
      return Collections.emptyList(); // at least one block is not local
    }
    List retval = new LinkedList();
    for (LocatedBlock blk : blockRange) {
      BlockReader blkReader = fetchBlockReader(blk, localDNAddrPair);
      ClientMmap mmap = blkReader.getClientMmap(options);
      // Instruction to cache-eviction to avoid unmapping this.
      // Slots, streams & all other resources will be closed.
      mmap.setunmap(false);
      retval.add(mmap.getMappedByteBuffer());
      closeBlockReader(blkReader);
    }
    return retval;
  }

Apps opening InputStreams only once (HBase??) can call this method & use the zero-copy buffers for reads, if the file is local. If not available, they can fall back to the regular DFSInputStream. Reads can eliminate sync overheads & get the same perf as a local filesystem. But I don't know if "leaking" MappedByteBuffers to calling code can have nasty side-effects. > A zero-copy ScatterGatherRead api from FSDataInputStream > > > Key: HDFS-3051 > URL: https://issues.apache.org/jira/browse/HDFS-3051 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client, performance >Reporter: dhruba borthakur >Assignee: dhruba borthakur > > It will be nice if we can get a new API from FSDataInputStream that allows for > zero-copy read for hdfs readers.
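The idea of handing memory-mapped buffers to the caller can be tried on an ordinary local file with the JDK alone. The sketch below is not HDFS client code: it uses {{FileChannel.map}} on a temporary file, and the class name {{LocalMmapSketch}} and the sample payload are assumptions made for illustration. It does show the property the comment relies on, namely that a mapping remains usable after the channel that created it has been closed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: maps a local file read-only and returns the buffer to the caller. */
public class LocalMmapSketch {

  /**
   * Maps the whole file read-only. The channel is closed before returning;
   * per the FileChannel.map contract the mapping itself stays valid.
   */
  public static MappedByteBuffer mapWholeFile(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("mmap-sketch", ".bin");
    tmp.toFile().deleteOnExit();
    Files.write(tmp, "zero-copy".getBytes(StandardCharsets.UTF_8));

    MappedByteBuffer buf = mapWholeFile(tmp);
    byte[] out = new byte[buf.remaining()];
    buf.get(out); // reads straight from the mapped pages
    System.out.println(new String(out, StandardCharsets.UTF_8));
  }
}
```

The "leaking" concern raised in the comment applies here as well: the JDK offers no public call to unmap the buffer explicitly, so the mapping lives until the buffer is garbage collected.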
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374536#comment-15374536 ] Hadoop QA commented on HDFS-10544: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} 
the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 135 unchanged - 0 fixed = 137 total (was 135) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 52s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 80m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817602/HDFS-10544.05.patch | | JIRA Issue | HDFS-10544 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 2d63fcb45c0a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 06c56ff | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/16039/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/16039/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16039/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1 > >
[jira] [Commented] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
[ https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374519#comment-15374519 ] Rakesh R commented on HDFS-10590: - Thank you [~umamaheswararao] for reviewing and committing the patch. > Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures > > > Key: HDFS-10590 > URL: https://issues.apache.org/jira/browse/HDFS-10590 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 3.0.0-alpha1 > > Attachments: HDFS-10590-00.patch > > > This jira is to fix the test case failure. Please see the below stacktrace. > Reference : > [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/] > {code} > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
[ https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-10590: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha1 Status: Resolved (was: Patch Available) I have committed this to trunk. Thanks Rakesh. > Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures > > > Key: HDFS-10590 > URL: https://issues.apache.org/jira/browse/HDFS-10590 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 3.0.0-alpha1 > > Attachments: HDFS-10590-00.patch > > > This jira is to fix the test case failure. Please see the below stacktrace. > Reference : > [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/] > {code} > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
[ https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374516#comment-15374516 ] Hudson commented on HDFS-10590: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10087 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10087/]) HDFS-10590: Fix TestReconstructStripedBlocks.testCountLiveReplicas test (uma.gangumalla: rev 438b7c5935f4314fd37916aee4369e67ec2887f8) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestReconstructStripedBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/StripedFileTestUtil.java > Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures > > > Key: HDFS-10590 > URL: https://issues.apache.org/jira/browse/HDFS-10590 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 3.0.0-alpha1 > > Attachments: HDFS-10590-00.patch > > > This jira is to fix the test case failure. Please see the below stacktrace. > Reference : > [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/] > {code} > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption
[ https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374480#comment-15374480 ] Yongjun Zhang commented on HDFS-10587: -- About the visibleLength, I saw the following in ReplicaBeingWritten.java:
{code}
@Override
public long getVisibleLength() {
  return getBytesAcked(); // all acked bytes are visible
}
{code}
This means different replicas may have different visibleLength values, because BytesAcked may differ between DataNodes. My earlier effort was to claim that using a different visibleLength on the BlockReceiver side than on the BlockSender side is wrong. Based on the above code, it might be OK to treat the received data length as the visibleLength at the destination side of blockTransfer (better to get confirmation though). So we need to understand how the corruption really happened, and where in the block data: did it happen when we received this chunk of data, or when we received new data after reconstructing the pipeline? Based on my analysis so far, the skipping of the bytes on disk (mentioned in the following statement) is necessary, since the data is not garbage (assuming the data at the Sender side is good).
{quote}
(8) When new data was appended to the destination, it skipped the bytes already on disk. Therefore, whatever was written as garbage was not replaced.
{quote}
One possibility is that the checksum handling there is not correct in a corner case. If we have a test case to replicate the issue, we need to look at both the source-side data and the destination-side data, to see whether it's real data corruption or a checksum miscalculation. If there is corruption, we also need to find exactly where it is.
> Incorrect offset/length calculation in pipeline recovery causes block > corruption > > > Key: HDFS-10587 > URL: https://issues.apache.org/jira/browse/HDFS-10587 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-10587.001.patch > > > We found incorrect offset and length calculation in pipeline recovery may > cause block corruption and results in missing blocks under a very unfortunate > scenario. > (1) A client established pipeline and started writing data to the pipeline. > (2) One of the data node in the pipeline restarted, closing the socket, and > some written data were unacknowledged. > (3) Client replaced the failed data node with a new one, initiating block > transfer to copy existing data in the block to the new datanode. > (4) The block is transferred to the new node. Crucially, the entire block, > including the unacknowledged data, was transferred. > (5) The last chunk (512 bytes) was not a full chunk, but the destination > still reserved the whole chunk in its buffer, and wrote the entire buffer to > disk, therefore some written data is garbage. > (6) When the transfer was done, the destination data node converted the > replica from temporary to rbw, which made its visible length as the length of > bytes on disk. That is to say, it thought whatever was transferred was > acknowledged. However, the visible length of the replica is different (round > up to the next multiple of 512) than the source of transfer. [1] > (7) Client then truncated the block in the attempt to remove unacknowledged > data. However, because the visible length is equivalent of the bytes on disk, > it did not truncate unacknowledged data. > (8) When new data was appended to the destination, it skipped the bytes > already on disk. Therefore, whatever was written as garbage was not replaced. 
> (9) the volume scanner detected corrupt replica, but due to HDFS-10512, it > wouldn’t tell NameNode to mark the replica as corrupt, so the client > continued to form a pipeline using the corrupt replica. > (10) Finally the DN that had the only healthy replica was restarted. NameNode > then update the pipeline to only contain the corrupt replica. > (11) Client continue to write to the corrupt replica, because neither client > nor the data node itself knows the replica is corrupt. When the restarted > datanodes comes back, their replica are stale, despite they are not corrupt. > Therefore, none of the replica is good and up to date. > The sequence of events was reconstructed based on DataNode/NameNode log and > my understanding of code. > Incidentally, we have observed the same sequence of events on two independent > clusters. > [1] > The sender has the replica as follows: > 2016-04-15 22:03:05,066 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: > Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW > getNumBytes() = 41381376 > getBytesOnDisk() = 41381376 >
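The length mismatch described in steps (5) and (6) of the issue description comes down to rounding a length up to the next multiple of the 512-byte chunk size. The sketch below only illustrates that arithmetic; it is not DataNode code, and the class name {{ChunkLengthSketch}}, the method names, and the 1000-byte example length are assumptions chosen for illustration.

```java
/** Illustrative only: arithmetic behind a partial last chunk being written as a full chunk. */
public class ChunkLengthSketch {

  /** Chunk size quoted in the issue description. */
  public static final int BYTES_PER_CHUNK = 512;

  /** Rounds length up to the next multiple of bytesPerChunk (unchanged if already a multiple). */
  public static long roundUpToChunk(long length, int bytesPerChunk) {
    return ((length + bytesPerChunk - 1) / bytesPerChunk) * bytesPerChunk;
  }

  /** Bytes that end up on disk beyond the real data when the whole last chunk is written. */
  public static long paddingBytes(long length, int bytesPerChunk) {
    return roundUpToChunk(length, bytesPerChunk) - length;
  }

  public static void main(String[] args) {
    long realLength = 1000; // hypothetical length whose last chunk is partial
    System.out.println("bytes on disk after transfer: "
        + roundUpToChunk(realLength, BYTES_PER_CHUNK));
    System.out.println("bytes past the real data:     "
        + paddingBytes(realLength, BYTES_PER_CHUNK));
  }
}
```

If the destination then reports the rounded-up figure as its visible length, a later truncate to the acknowledged length has nothing to remove from its point of view, which matches the behaviour described in step (7).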
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374470#comment-15374470 ] Zhe Zhang commented on HDFS-10544: -- Committed v5 patch to trunk, branch-2, and branch-2.8. I'm working on resolving branch-2.7 conflicts. > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, > HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, > HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
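The decision proposed in the issue description (use the logical URI only when HA is enabled and the failover proxy provider uses logical URIs, otherwise resolve a physical address) can be sketched independently of Hadoop. Everything below is hypothetical scaffolding: {{NsConfig}}, its fields, and the host names are made up for illustration and do not reflect the actual {{DFSUtil}} code or the committed patch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: choosing between a logical and a physical NameNode URI. */
public class NameServiceUriSketch {

  /** Hypothetical per-nameservice settings; not a Hadoop class. */
  public static final class NsConfig {
    final boolean haEnabled;
    final boolean useLogicalUri;   // what the failover proxy provider reports
    final String physicalAddress;  // host:port to fall back to

    public NsConfig(boolean haEnabled, boolean useLogicalUri, String physicalAddress) {
      this.haEnabled = haEnabled;
      this.useLogicalUri = useLogicalUri;
      this.physicalAddress = physicalAddress;
    }
  }

  /** Returns one URI per nameservice, logical only when both conditions hold. */
  public static List<URI> nameServiceUris(Map<String, NsConfig> nameservices) {
    List<URI> ret = new ArrayList<>();
    for (Map.Entry<String, NsConfig> e : nameservices.entrySet()) {
      NsConfig c = e.getValue();
      if (c.haEnabled && c.useLogicalUri) {
        ret.add(URI.create("hdfs://" + e.getKey()));         // logical URI of the nameservice
      } else {
        ret.add(URI.create("hdfs://" + c.physicalAddress));  // physical address instead
      }
    }
    return ret;
  }

  public static void main(String[] args) {
    Map<String, NsConfig> conf = new LinkedHashMap<>();
    conf.put("ns1", new NsConfig(true, true, "nn1.example.com:8020"));
    conf.put("ns2", new NsConfig(true, false, "vip.example.com:8020"));
    System.out.println(nameServiceUris(conf));
  }
}
```

In this sketch the second nameservice models the IP-failover case: HA is enabled, but clients are expected to connect to a single virtual address, so the logical name is never handed out.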
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10544: - Fix Version/s: 3.0.0-alpha1 2.9.0 2.8.0 > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1 > > Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, > HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, > HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374461#comment-15374461 ] Hudson commented on HDFS-10544: --- SUCCESS: Integrated in Hadoop-trunk-Commit #10086 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10086/]) HDFS-10544. Balancer doesn't work with IPFailoverProxyProvider. (zhz: rev 087290e6b1cb1082646d966b65494082712ebe3e) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java > Balancer doesn't work with IPFailoverProxyProvider > -- > > Key: HDFS-10544 > URL: https://issues.apache.org/jira/browse/HDFS-10544 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, > HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, > HDFS-10544.05.patch > > > Right now {{Balancer}} gets the NN URIs through > {{DFSUtil#getNameServiceUris}}, which returns logical URIs in HA is enabled. > If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to > start. > I think the bug is at {{DFSUtil#getNameServiceUris}}: > {code} > for (String nsId : getNameServiceIds(conf)) { > if (HAUtil.isHAEnabled(conf, nsId)) { > // Add the logical URI of the nameservice. > try { > ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId)); > {code} > Then {{if}} clause should also consider if the {{FailoverProxyProvider}} has > {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to > resolve the physical URI for this nsId. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured
[ https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Xiaoqiao updated HDFS-10606: --- Affects Version/s: (was: 2.7.1) 2.7.0 Status: Patch Available (was: Open) > TrashPolicyDefault supports time of auto clean up can configured > > > Key: HDFS-10606 > URL: https://issues.apache.org/jira/browse/HDFS-10606 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.7.0 >Reporter: He Xiaoqiao > Attachments: HDFS-10606-branch-2.7.001.patch > > > TrashPolicyDefault clean up Trash based on > [UTC|http://www.worldtimeserver.com/current_time_in_UTC.aspx] currently and > the time of cleaning up is 00:00 UTC. when there are large amount of trash > data should be auto-clean, it will block NN for a long time since Global > Lock, In the most serious situations it may lead some cron job submit > failure. if add configuration about time of cleaning up, it will avoid impact > on this cron jobs at that default time. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
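The request in the issue description, moving the cleanup away from 00:00 UTC, amounts to adding an offset when computing the next interval boundary. The sketch below shows only that calculation; it is not the TrashPolicyDefault implementation or the attached patch, and the class name {{TrashScheduleSketch}}, the method name, and the offset parameter are hypothetical.

```java
/** Illustrative only: next cleanup time on an interval grid shifted by a configurable offset. */
public class TrashScheduleSketch {

  /**
   * Returns the first instant strictly after nowMs that falls on the grid
   * offsetMs, offsetMs + intervalMs, offsetMs + 2 * intervalMs, ... (epoch milliseconds, UTC).
   */
  public static long nextCleanup(long nowMs, long intervalMs, long offsetMs) {
    long periodsElapsed = Math.floorDiv(nowMs - offsetMs, intervalMs);
    return (periodsElapsed + 1) * intervalMs + offsetMs;
  }

  public static void main(String[] args) {
    long day = 24L * 60 * 60 * 1000;
    long threeHours = 3L * 60 * 60 * 1000;
    long now = System.currentTimeMillis();
    // Offset 0 lands on 00:00 UTC, the behaviour the issue describes.
    System.out.println("next run, no offset:  " + new java.util.Date(nextCleanup(now, day, 0)));
    // A hypothetical 3-hour offset moves the daily run to 03:00 UTC.
    System.out.println("next run, 3h offset:  " + new java.util.Date(nextCleanup(now, day, threeHours)));
  }
}
```

Shifting the grid does not shorten the time the NameNode holds its lock during cleanup; it only moves that window away from the hour when the cron jobs mentioned in the description are submitted.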
[jira] [Updated] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured
[ https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Xiaoqiao updated HDFS-10606:
---
Attachment: HDFS-10606-branch-2.7.001.patch

Submitting a patch for branch-2.7.

> TrashPolicyDefault supports time of auto clean up can configured
>
>
> Key: HDFS-10606
> URL: https://issues.apache.org/jira/browse/HDFS-10606
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.7.1
> Reporter: He Xiaoqiao
> Attachments: HDFS-10606-branch-2.7.001.patch
>
>
> TrashPolicyDefault currently cleans up Trash based on
> [UTC|http://www.worldtimeserver.com/current_time_in_UTC.aspx], and the
> cleanup time is 00:00 UTC. When a large amount of trash data has to be
> auto-cleaned, the cleanup blocks the NN for a long time because of the
> global lock. In the most serious cases this may cause some cron job
> submissions to fail. Adding a configuration option for the cleanup time
> would avoid the impact on cron jobs that run at that default time.
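To make the HDFS-10606 request concrete, here is a minimal sketch of how a configurable cleanup time of day could be turned into a delay until the next run. It is not taken from the attached patch and does not reflect how {{TrashPolicyDefault}} is actually implemented; the class {{TrashCleanupSchedule}}, the method {{untilNextCleanup}}, and the idea of passing the configured time as a {{LocalTime}} are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

/** Hypothetical helper; not part of Hadoop. */
final class TrashCleanupSchedule {

  private TrashCleanupSchedule() {
  }

  /**
   * Returns how long to wait from {@code now} until the next occurrence of
   * {@code cleanupTime}, interpreted as a time of day in UTC.
   * With cleanupTime = 00:00 this matches the midnight-UTC behaviour
   * described in the issue; any other value shifts the cleanup.
   */
  static Duration untilNextCleanup(ZonedDateTime now, LocalTime cleanupTime) {
    ZonedDateTime nowUtc = now.withZoneSameInstant(ZoneOffset.UTC);
    // Same calendar day in UTC, at the configured time of day.
    ZonedDateTime next = nowUtc.with(cleanupTime);
    if (!next.isAfter(nowUtc)) {
      // Today's slot has already passed (or is exactly now): use tomorrow's.
      next = next.plusDays(1);
    }
    return Duration.between(nowUtc, next);
  }
}
```

An operator whose cron jobs cluster around 00:00 UTC could then pick a quieter slot such as 02:30 UTC and let the emptier sleep for the returned duration before each run.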
[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-10544:
-
Attachment: HDFS-10544.05.patch

Thanks [~shv] for the review! Attaching the v5 patch, which fixes 2 checkstyle issues in v4 (lines in {{TestDFSUtil}} too long). The other 2 checkstyle issues are inherent to the original code style. The reported test failures are unrelated and cannot be reproduced locally.

I'll commit the v5 patch shortly.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch,
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch,
> HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled.
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to
> start.
> I think the bug is at {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}}
> has {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to
> resolve the physical URI for this nsId.
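The condition proposed in the HDFS-10544 description can be sketched in isolation as follows. This is not the committed patch and it deliberately avoids the real Hadoop classes: {{NameServiceUriSketch}}, its nested {{NameService}} type, and the fields {{haEnabled}}, {{useLogicalUri}} and {{physicalAddress}} are hypothetical stand-ins for what the actual code would read from the configuration and the proxy provider.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only; none of these types are the real Hadoop classes. */
final class NameServiceUriSketch {

  private NameServiceUriSketch() {
  }

  /** Hypothetical per-nameservice settings. */
  static final class NameService {
    final boolean haEnabled;
    // Assumed to mirror whether the failover proxy provider uses logical URIs.
    final boolean useLogicalUri;
    // Assumed host:port of the NameNode (or of the floating IP).
    final String physicalAddress;

    NameService(boolean haEnabled, boolean useLogicalUri, String physicalAddress) {
      this.haEnabled = haEnabled;
      this.useLogicalUri = useLogicalUri;
      this.physicalAddress = physicalAddress;
    }
  }

  /**
   * Returns one URI per nameservice: the logical URI only when HA is enabled
   * AND the provider uses logical URIs, otherwise the physical address.
   */
  static List<URI> getNameServiceUris(Map<String, NameService> nameServices) {
    List<URI> ret = new ArrayList<>();
    for (Map.Entry<String, NameService> entry : nameServices.entrySet()) {
      NameService ns = entry.getValue();
      if (ns.haEnabled && ns.useLogicalUri) {
        // Logical URI of the nameservice, e.g. hdfs://ns1
        ret.add(URI.create("hdfs://" + entry.getKey()));
      } else {
        // Fall back to the physical address, e.g. hdfs://10.0.0.5:8020
        ret.add(URI.create("hdfs://" + ns.physicalAddress));
      }
    }
    return ret;
  }
}
```

Under these assumptions, a nameservice served through an IP-failover style provider ({{useLogicalUri}} false) yields a physical URI that a client such as {{Balancer}} can connect to directly, while a nameservice with logical URIs enabled keeps its {{hdfs://nsId}} form.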