[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8058: Attachment: HDFS-8058-HDFS-7285.007.patch Thanks Jing and Yi for the suggestions! Updating the patch. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps a separate list for striped blocks, its methods duplicate those in INodeFile, and the current code has to check {{isStriped}} and then branch. Also, if a file is striped, the {{blocks}} field in INodeFile still occupies memory for the reference. Neither is necessary; we can use the same {{blocks}} array to make the code clearer. I keep {{FileWithStripedBlocksFeature}} empty for follow-on use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and storing them in each block would waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
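[Editor's note] A heavily simplified, hypothetical sketch of the unified-array idea described above (not the attached patch): both BlockInfoContiguous and BlockInfoStriped extend BlockInfo, so one array can hold either layout and the separate striped-block list goes away.
{code}
// Hypothetical sketch only -- not HDFS-8058-HDFS-7285.007.patch.
// A single blocks array serves both contiguous and striped files.
public class INodeFile {
  private BlockInfo[] blocks = new BlockInfo[0];

  void addBlock(BlockInfo newBlock) {
    // Grow-by-one append, as INodeFile does for contiguous blocks today.
    BlockInfo[] newArr = new BlockInfo[blocks.length + 1];
    System.arraycopy(blocks, 0, newArr, 0, blocks.length);
    newArr[blocks.length] = newBlock;
    blocks = newArr;
  }

  BlockInfo getLastBlock() {
    return blocks.length == 0 ? null : blocks[blocks.length - 1];
  }
}
{code}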
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625959#comment-14625959 ] Kai Sasaki commented on HDFS-8702: -- [~walter.k.su] Thank you so much! Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625996#comment-14625996 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 0s | Pre-patch HDFS-7285 has 5 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 7s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 22s | The applied patch generated 8 new checkstyle issues (total was 335, now 329). | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 22s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 32m 11s | Tests failed in hadoop-hdfs. | | | | 79m 43s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestReservedRawPaths | | | hadoop.hdfs.server.blockmanagement.TestDatanodeManager | | | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots | | | hadoop.hdfs.TestModTime | | | hadoop.fs.TestUrlStreamHandler | | | hadoop.hdfs.security.TestDelegationToken | | | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolarent | | | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | | | hadoop.hdfs.server.namenode.TestFileLimit | | | hadoop.hdfs.TestParallelShortCircuitRead | | | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot | | | hadoop.hdfs.TestDisableConnCache | | | hadoop.hdfs.server.blockmanagement.TestBlockInfoStriped | | | hadoop.hdfs.server.namenode.TestEditLogAutoroll | | | hadoop.TestRefreshCallQueue | | | hadoop.hdfs.protocolPB.TestPBHelper | | | hadoop.hdfs.TestECSchemas | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.hdfs.TestConnCache | | | hadoop.cli.TestCryptoAdminCLI | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestSetrepDecreasing | | | hadoop.hdfs.server.datanode.TestDiskError | | | hadoop.fs.viewfs.TestViewFsWithAcls | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.namenode.TestAddStripedBlocks | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.namenode.TestHostsFiles | | | hadoop.hdfs.server.datanode.TestTransferRbw | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | hadoop.fs.contract.hdfs.TestHDFSContractDelete | | | hadoop.hdfs.server.namenode.TestFileContextAcl | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.fs.TestFcHdfsSetUMask | | | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.server.namenode.TestClusterId | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestPread | | | hadoop.hdfs.server.namenode.TestFSDirectory | | | 
hadoop.hdfs.server.namenode.TestLeaseManager | | | hadoop.fs.contract.hdfs.TestHDFSContractOpen | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotListing | | | hadoop.hdfs.server.datanode.TestStorageReport | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.TestReadWhileWriting | | | hadoop.fs.contract.hdfs.TestHDFSContractMkdir | | | hadoop.fs.contract.hdfs.TestHDFSContractAppend | | | hadoop.hdfs.server.datanode.TestFsDatasetCache | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA | | | hadoop.hdfs.server.namenode.ha.TestGetGroupsWithHA | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.TestSecondaryWebUi | | | hadoop.hdfs.server.namenode.TestMalformedURLs | | | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.TestWriteBlockGetsBlockLengthHint | | | hadoop.hdfs.TestDatanodeLayoutUpgrade | | |
[jira] [Updated] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-8702: - Attachment: HDFS-8702-HDFS-7285.03.patch Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8728) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625925#comment-14625925 ] Walter Su commented on HDFS-8728: - There is already a jira related to EC zones unpacking. (HDFS-8594) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile --- Key: HDFS-8728 URL: https://issues.apache.org/jira/browse/HDFS-8728 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8728-HDFS-7285.00.patch, HDFS-8728-HDFS-7285.01.patch, HDFS-8728.00.patch, HDFS-8728.01.patch, HDFS-8728.02.patch, Merge-1-codec.patch, Merge-2-ecZones.patch, Merge-3-blockInfo.patch, Merge-4-blockmanagement.patch, Merge-5-blockPlacementPolicies.patch, Merge-6-locatedStripedBlock.patch, Merge-7-replicationMonitor.patch, Merge-8-inodeFile.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625956#comment-14625956 ] Walter Su commented on HDFS-8702: - The 04 patch LGTM. Let's wait until tomorrow to see if Jing Zhao has any comments. Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625913#comment-14625913 ] Walter Su commented on HDFS-8702: - 1. Comparing expectedStorageNum with 1 is pointless. Please just add hasClusterEverBeenMultiRack() at the head of the function. 2. The meaning of @param and @return is clear enough, so they can be removed. 3. Don't fix the code style of the function for contiguous blocks. I don't mean it's wrong; it just increases the feature-branch patch size and makes merging hard. If this JIRA were against trunk, I'd fully support it. Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
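[Editor's note] A hedged sketch of the early-return shape that review point 1 suggests. The fields {{datanodeManager}} and {{blocksMap}} assume the BlockManager context; the body is illustrative, not the actual HDFS-8702 patch.
{code}
// Illustrative sketch of the suggested guard -- not the HDFS-8702 patch.
private boolean blockHasEnoughRacks(BlockInfo storedBlock, int expectedStorageNum) {
  // Suggested: bail out first if the cluster has never had multiple racks,
  // instead of comparing expectedStorageNum with 1.
  if (!datanodeManager.hasClusterEverBeenMultiRack()) {
    return true;
  }
  Set<String> racks = new HashSet<>();
  for (DatanodeStorageInfo storage : blocksMap.getStorages(storedBlock)) {
    racks.add(storage.getDatanodeDescriptor().getNetworkLocation());
  }
  // A striped block wants as many racks as internal blocks, capped by
  // what the cluster can actually offer.
  int numRacks = datanodeManager.getNetworkTopology().getNumOfRacks();
  return racks.size() >= Math.min(expectedStorageNum, numRacks);
}
{code}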
[jira] [Created] (HDFS-8768) The display of Erasure Code file block group ID in WebUI is not consistent with fsck command
GAO Rui created HDFS-8768: - Summary: The display of Erasure Code file block group ID in WebUI is not consistent with fsck command Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui For example, in the WebUI (http://[namenode address]:50070), one Erasure Code file with one block group was displayed as in the attached screenshot. But with the fsck command, the block group of the same file was displayed like: [[0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640]] After checking block file names on the datanodes, we believe the WebUI may have a problem with the Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-8702: - Attachment: HDFS-8702-HDFS-7285.04.patch Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8578: Attachment: HDFS-8578-06.patch Attached a patch with some more test fixes. On upgrade, Datanode should process all storage/data dirs in parallel - Key: HDFS-8578 URL: https://issues.apache.org/jira/browse/HDFS-8578 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Raju Bairishetti Priority: Critical Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-branch-2.6.0.patch Right now, during upgrades the datanode processes all the storage dirs sequentially. Assume it takes ~20 mins to process a single storage dir; then a datanode with ~10 disks will take around 3 hours to come up. *BlockPoolSliceStorage.java*
{code}
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
  assert getCTime() == nsInfo.getCTime()
      : "Data-node and name-node CTimes must be the same.";
}
{code}
It would save lots of time during major upgrades if the datanode processed all storage dirs/disks in parallel. Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
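[Editor's note] A minimal sketch of the parallel approach the issue asks for (hypothetical; HDFS-8578-06.patch may be structured differently), running doTransition for every storage dir concurrently on an ExecutorService:
{code}
// Hypothetical sketch -- not the actual HDFS-8578 patch. Assumes the
// surrounding parameters (datanode, nsInfo, startOpt) are effectively final.
ExecutorService pool = Executors.newFixedThreadPool(getNumStorageDirs());
List<Future<Void>> futures = new ArrayList<>();
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  final StorageDirectory sd = getStorageDir(idx);
  futures.add(pool.submit(new Callable<Void>() {
    @Override
    public Void call() throws IOException {
      doTransition(datanode, sd, nsInfo, startOpt);
      return null;
    }
  }));
}
try {
  for (Future<Void> f : futures) {
    f.get();  // rethrows the first doTransition failure, if any
  }
} catch (InterruptedException | ExecutionException e) {
  throw new IOException("Storage dir transition failed", e);
} finally {
  pool.shutdown();
}
assert getCTime() == nsInfo.getCTime()
    : "Data-node and name-node CTimes must be the same.";
{code}
With ~10 disks at ~20 minutes each, the upgrade cost drops from the sum of the per-dir times to roughly the slowest single dir.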
[jira] [Assigned] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B reassigned HDFS-8578: --- Assignee: Vinayakumar B On upgrade, Datanode should process all storage/data dirs in parallel - Key: HDFS-8578 URL: https://issues.apache.org/jira/browse/HDFS-8578 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Raju Bairishetti Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-branch-2.6.0.patch Right now, during upgrades the datanode processes all the storage dirs sequentially. Assume it takes ~20 mins to process a single storage dir; then a datanode with ~10 disks will take around 3 hours to come up. *BlockPoolSliceStorage.java*
{code}
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
  assert getCTime() == nsInfo.getCTime()
      : "Data-node and name-node CTimes must be the same.";
}
{code}
It would save lots of time during major upgrades if the datanode processed all storage dirs/disks in parallel. Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8770) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
ade created HDFS-8770: - Summary: ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates Key: HDFS-8770 URL: https://issues.apache.org/jira/browse/HDFS-8770 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0, 2.6.0 Reporter: ade Assignee: ade Priority: Critical The Namenode shut down when the ReplicationMonitor thread received a Runtime exception:
{quote}
2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628)
at java.lang.Thread.run(Thread.java:744)
{quote}
We use hadoop-2.6.0 configured with heterogeneous storages and setStoragePolicy One_SSD on some paths. When a block has excess replicas, e.g. 2 SSD replicas on different racks (the exactlyOne set) and 2 DISK replicas on the same rack (the moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete returns null because only the moreThanOne set is searched for the SSD replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
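[Editor's note] A hedged sketch of the kind of fix this description implies (the helper {{chooseFrom}} is hypothetical shorthand; the attached HDFS-8770_v1.patch may differ): fall back to the exactlyOne set when the moreThanOne set has no replica of the excess storage type.
{code}
// Illustrative only; chooseFrom is hypothetical shorthand for the
// selection logic inside chooseReplicaToDelete.
DatanodeStorageInfo cur = chooseFrom(moreThanOne, excessTypes);
if (cur == null) {
  // No replica of the excess storage type (e.g. SSD) lives in the
  // moreThanOne set, so also search the exactlyOne set instead of
  // returning null and NPE-ing in adjustSetsWithChosenReplica.
  cur = chooseFrom(exactlyOne, excessTypes);
}
return cur;
{code}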
[jira] [Commented] (HDFS-8728) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625903#comment-14625903 ] Zhe Zhang commented on HDFS-8728: - Thanks [~andrew.wang] for the thorough review! # {{getOp}} is a good idea to simplify code. Will include in the next rev. # Agreed on {{BIUCS#setTruncateBlock}} and saving reference in {{StripedBlockStorageOp}}. # HDFS-8032 is the follow-on task to optimize indices memory usage. # Good points on EC zones unpacking, will address in next rev. # [~drankye] could you take a look at the {{SchemaLoader}} comment? Erasure coding: revisit and simplify BlockInfoStriped and INodeFile --- Key: HDFS-8728 URL: https://issues.apache.org/jira/browse/HDFS-8728 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8728-HDFS-7285.00.patch, HDFS-8728-HDFS-7285.01.patch, HDFS-8728.00.patch, HDFS-8728.01.patch, HDFS-8728.02.patch, Merge-1-codec.patch, Merge-2-ecZones.patch, Merge-3-blockInfo.patch, Merge-4-blockmanagement.patch, Merge-5-blockPlacementPolicies.patch, Merge-6-locatedStripedBlock.patch, Merge-7-replicationMonitor.patch, Merge-8-inodeFile.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8734) Erasure Coding: fix one cell need two packets
[ https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625900#comment-14625900 ] Walter Su commented on HDFS-8734: - Thanks [~libo-intel] for the review! About #1: I copied the logic from FSOutputSummer#writeChecksumChunks(..). I think we should keep the parity and data patterns of calling writeChunk(..) consistent. About #2: currentPackets[] are null at first. I only back up the currentPacket for the swapped-out streamer; I don't back up the currentPacket for the swapped-in streamer. For example: streamers #1 and #2 have their job done. Now streamer #3 is swapped in. I back up the #2 currentPacket and get the #3 currentPacket from the array. Streamer #3 creates packet_0, packet_1, ..., packet_N. Because packets 0~(N-1) are finished, I only have to back up packet_N for #3. Then it's OK to switch to streamer #4. The next time, I check out packet_N, which is unfinished, for streamer #3. Erasure Coding: fix one cell need two packets - Key: HDFS-8734 URL: https://issues.apache.org/jira/browse/HDFS-8734 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch The default WritePacketSize is 64k, and the current default cellSize is 64k. We hope one cell consumes one packet; in fact it doesn't. By default, chunkSize = 516 (512 data + 4 checksum), packetSize = 64k, chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for details), so the number of data bytes in one packet = 64512, while cellSize = 65536. When the first packet is full (with 64512 data bytes), there are still 65536 - 64512 = 1024 bytes left.
{code}
super.writeChunk(bytes, offset, len, checksum, ckoff, cklen);
// cell is full and current packet has not been enqueued
if (cellFull && currentPacket != null) {
  enqueueCurrentPacketFull();
}
{code}
When the last 1024 bytes of the cell are written, we hit {{cellFull}} and create another packet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
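[Editor's note] The mismatch in the description, as a quick worked calculation (constants exactly as quoted above):
{code}
// Worked example of the numbers in the HDFS-8734 description.
int bytesPerChecksum = 512;
int chunkSize = bytesPerChecksum + 4;                    // 516: 512 data + 4 checksum
int packetSize = 64 * 1024;                              // default write packet size
int chunksPerPacket = 126;                               // DFSOutputStream#computePacketChunkSize
int dataPerPacket = chunksPerPacket * bytesPerChecksum;  // 64512 data bytes per packet
int cellSize = 64 * 1024;                                // 65536 bytes per cell
int leftover = cellSize - dataPerPacket;                 // 1024 bytes spill into a 2nd packet
{code}
So each 64k cell needs one full packet plus a second, nearly empty one for the trailing 1024 bytes, which is the "one cell, two packets" problem the issue fixes.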
[jira] [Updated] (HDFS-8768) The display of Erasure Code file block group ID in WebUI is not consistent with fsck command
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-8768: -- Description: For example, In WebUI( http://[namenode address]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. was: For example, In WebUI( http://[namenode address]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: [[0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640]] After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. The display of Erasure Code file block group ID in WebUI is not consistent with fsck command Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui For example, In WebUI( http://[namenode address]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8768) The display of Erasure Code file block group ID in WebUI is not consistent with fsck command
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-8768: -- Attachment: Screen Shot 2015-07-14 at 15.33.08.png The display of Erasure Code file block group ID in WebUI is not consistent with fsck command Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui Attachments: Screen Shot 2015-07-14 at 15.33.08.png For example, In WebUI( http://[namenodeaddress]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7314) When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient
[ https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625950#comment-14625950 ] Hadoop QA commented on HDFS-7314: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 4s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 33s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 20s | The applied patch generated 2 new checkstyle issues (total was 137, now 137). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 17s | Tests failed in hadoop-hdfs. | | | | 203m 50s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745179/HDFS-7314-9.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / a431ed9 | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11692/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11692/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11692/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11692/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11692/console | This message was automatically generated. When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient --- Key: HDFS-7314 URL: https://issues.apache.org/jira/browse/HDFS-7314 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Labels: BB2015-05-TBR Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, HDFS-7314-5.patch, HDFS-7314-6.patch, HDFS-7314-7.patch, HDFS-7314-8.patch, HDFS-7314-9.patch, HDFS-7314.patch It happened in YARN nodemanger scenario. But it could happen to any long running service that use cached instance of DistrbutedFileSystem. 1. Active NN is under heavy load. So it became unavailable for 10 minutes; any DFSClient request will get ConnectTimeoutException. 2. 
YARN nodemanager use DFSClient for certain write operation such as log aggregator or shared cache in YARN-1492. DFSClient used by YARN NM's renewLease RPC got ConnectTimeoutException. {noformat} 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds. Aborting ... {noformat} 3. After DFSClient is in Aborted state, YARN NM can't use that cached instance of DistributedFileSystem. {noformat} 2014-10-29 20:26:23,991 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc... java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at
[jira] [Updated] (HDFS-8768) The display of Erasure Code file block group ID in WebUI is not consistent with fsck command
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-8768: -- Description: For example, In WebUI( usually, namenode port: 50070) , one Erasure Code file with one block group was displayed as the attached screenshot [^Screen Shot 2015-07-14 at 15.33.08.png]. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. was: For example, In WebUI( http://[namenodeaddress]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. The display of Erasure Code file block group ID in WebUI is not consistent with fsck command Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui Attachments: Screen Shot 2015-07-14 at 15.33.08.png For example, In WebUI( usually, namenode port: 50070) , one Erasure Code file with one block group was displayed as the attached screenshot [^Screen Shot 2015-07-14 at 15.33.08.png]. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8768) The display of Erasure Code file block group ID in WebUI is not consistent with fsck command
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-8768: -- Description: For example, In WebUI( http://[namenodeaddress]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. was: For example, In WebUI( http://[namenode address]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. The display of Erasure Code file block group ID in WebUI is not consistent with fsck command Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui Attachments: Screen Shot 2015-07-14 at 15.33.08.png For example, In WebUI( http://[namenodeaddress]:50070) , one Erasure Code file with one block group was displayed as the attached screenshot. But, with fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8728) Erasure coding: revisit and simplify BlockInfoStriped and INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625927#comment-14625927 ] Kai Zheng commented on HDFS-8728: - Thanks for the great work here! About the schema-loader-related comments: bq. SchemaLoader, is this class used anywhere? I didn't see any usages. Right, it's not used in the current code. There was some discussion that we may not need an XML schema file, and support for multiple configurable schemas was left for follow-on stages; the referencing code was removed but the class itself remains. We'll resume the related discussion and then decide whether the schema loader should also be removed. bq. SchemaLoader should also be made a static nested class, e.g. SchemaLoader.Loader, since there's no state. Sounds good. If the follow-on discussion decides it is needed after all, we can revisit and improve it as suggested. Erasure coding: revisit and simplify BlockInfoStriped and INodeFile --- Key: HDFS-8728 URL: https://issues.apache.org/jira/browse/HDFS-8728 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8728-HDFS-7285.00.patch, HDFS-8728-HDFS-7285.01.patch, HDFS-8728.00.patch, HDFS-8728.01.patch, HDFS-8728.02.patch, Merge-1-codec.patch, Merge-2-ecZones.patch, Merge-3-blockInfo.patch, Merge-4-blockmanagement.patch, Merge-5-blockPlacementPolicies.patch, Merge-6-locatedStripedBlock.patch, Merge-7-replicationMonitor.patch, Merge-8-inodeFile.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8769) unit test for SequentialBlockGroupIdGenerator
Walter Su created HDFS-8769: --- Summary: unit test for SequentialBlockGroupIdGenerator Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625995#comment-14625995 ] Surendra Singh Lilhore commented on HDFS-8541: -- Thanks [~szetszwo] for the review and commit. Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed Mover to exit after some retries when it fails to move blocks. Two additional suggestions: # The Mover retry counter should be incremented only if all moves fail; if some moves succeed, the counter should be reset. # Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
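[Editor's note] A hedged sketch of suggestion 1 above. {{retryCount}} and {{retryMaxAttempts}} follow the HDFS-8143 retry mechanism as described; {{hasMoveProgress}} is a hypothetical flag, and the committed patch may differ.
{code}
// Illustrative sketch of the suggested retry accounting -- not the
// committed HDFS-8541 patch. hasMoveProgress is a hypothetical flag
// meaning "at least one block move succeeded in this iteration".
if (hasMoveProgress) {
  retryCount.set(0);  // some moves succeeded: reset the counter
} else if (retryCount.incrementAndGet() >= retryMaxAttempts) {
  // no move made any progress: give up with the dedicated exit code
  // rather than IO_EXCEPTION
  return ExitStatus.NO_MOVE_PROGRESS;
}
{code}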
[jira] [Updated] (HDFS-8770) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
[ https://issues.apache.org/jira/browse/HDFS-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ade updated HDFS-8770: -- Status: Open (was: Patch Available) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates --- Key: HDFS-8770 URL: https://issues.apache.org/jira/browse/HDFS-8770 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0, 2.6.0 Reporter: ade Assignee: ade Priority: Critical Namenode shutdown when ReplicationMonitor thread received Runtime exception: {quote} 2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628) at java.lang.Thread.run(Thread.java:744) {quote} We use hadoop-2.6.0 configured with heterogeneous storages and setStoragePolicy some path One_SSD. When a block has excess replicated like 2 SSD replica on different rack(exactlyOne set) and 2 Disk on same rack(moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete return null because only moreThanOne set be chosen to find SSD replica -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8770) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
[ https://issues.apache.org/jira/browse/HDFS-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ade updated HDFS-8770: -- Status: Patch Available (was: Open) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates --- Key: HDFS-8770 URL: https://issues.apache.org/jira/browse/HDFS-8770 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0, 2.6.0 Reporter: ade Assignee: ade Priority: Critical Namenode shutdown when ReplicationMonitor thread received Runtime exception: {quote} 2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628) at java.lang.Thread.run(Thread.java:744) {quote} We use hadoop-2.6.0 configured with heterogeneous storages and setStoragePolicy some path One_SSD. When a block has excess replicated like 2 SSD replica on different rack(exactlyOne set) and 2 Disk on same rack(moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete return null because only moreThanOne set be chosen to find SSD replica -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8770) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
[ https://issues.apache.org/jira/browse/HDFS-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ade updated HDFS-8770: -- Attachment: HDFS-8770_v1.patch ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates --- Key: HDFS-8770 URL: https://issues.apache.org/jira/browse/HDFS-8770 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0, 2.7.0 Reporter: ade Assignee: ade Priority: Critical Attachments: HDFS-8770_v1.patch Namenode shutdown when ReplicationMonitor thread received Runtime exception: {quote} 2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628) at java.lang.Thread.run(Thread.java:744) {quote} We use hadoop-2.6.0 configured with heterogeneous storages and setStoragePolicy some path One_SSD. When a block has excess replicated like 2 SSD replica on different rack(exactlyOne set) and 2 Disk on same rack(moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete return null because only moreThanOne set be chosen to find SSD replica -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626128#comment-14626128 ] Brahma Reddy Battula commented on HDFS-8772: [~walter.k.su] Thanks for reporting. Nice work. The patch LGTM, +1 (non-binding). fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8772.01.patch https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/ {noformat} java.lang.AssertionError: expected:0 but was:4 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626159#comment-14626159 ] Hadoop QA commented on HDFS-8702: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 58s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 41s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 24s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 151m 20s | Tests failed in hadoop-hdfs. | | | | 194m 27s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745206/HDFS-8702-HDFS-7285.04.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / b1e6429 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11696/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11696/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11696/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11696/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11696/console | This message was automatically generated. Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. Logic needs updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626184#comment-14626184 ] Steve Loughran commented on HDFS-8767: -- This'll need a test for Unix which at least downgrades on Windows. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8771) If IPCLoggerChannel#purgeLogsOlderThan takes too long, Namenode cannot send other RPC calls to Journalnodes
Takuya Fukudome created HDFS-8771: - Summary: If IPCLoggerChannel#purgeLogsOlderThan takes too long, Namenode cannot send other RPC calls to Journalnodes Key: HDFS-8771 URL: https://issues.apache.org/jira/browse/HDFS-8771 Project: Hadoop HDFS Issue Type: Bug Reporter: Takuya Fukudome In our cluster, the edits accidentally became huge (about 50GB) and our Journalnodes' disks were busy, so {{purgeLogsOlderThan}} took more than 30 seconds. If {{IPCLoggerChannel#purgeLogsOlderThan}} takes too much time, the Namenode cannot send other RPC calls to the Journalnodes because {{o.a.h.hdfs.qjournal.client.IPCLoggerChannel}}'s executor is single-threaded. This can cause the namenode to shut down. I think IPCLoggerChannel#purgeLogsOlderThan should not block other RPC calls like sendEdits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
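[Editor's note] One possible direction, sketched under the assumption that purge calls can safely run off the channel's main queue (not a reviewed fix): a second single-thread executor dedicated to purging, so sendEdits never waits behind a slow purgeLogsOlderThan.
{code}
// Hypothetical sketch -- not a reviewed fix for HDFS-8771.
// sendEdits stays on the channel's existing single-thread executor;
// purge work moves to its own executor so it cannot block the queue.
// (The real IPCLoggerChannel returns Guava ListenableFutures; a plain
// Future is used here to keep the sketch self-contained.)
private final ExecutorService purgeExecutor =
    Executors.newSingleThreadExecutor();

public Future<Void> purgeLogsOlderThan(final long minTxIdToKeep) {
  return purgeExecutor.submit(new Callable<Void>() {
    @Override
    public Void call() throws IOException {
      getProxy().purgeLogsOlderThan(createReqInfo(), minTxIdToKeep);
      return null;
    }
  });
}
{code}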
[jira] [Created] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
Walter Su created HDFS-8772: --- Summary: fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/ {noformat} java.lang.AssertionError: expected:0 but was:4 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8767: -- Status: Patch Available (was: Open) As per the java.io.File documentation, file.isFile() and file.isDirectory() return false and file.list() returns null for non-regular files like pipes. In this case {{RawLocalFileSystem.listStatus()}} returns null. So we can check whether the file is of another type and return its status. Attached a patch with the fix. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file. The bug breaks Hive when Hive loads data from UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
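[Editor's note] A minimal sketch of the approach described in the comment (simplified; the attached HDFS-8767-00.patch may differ): when the path exists but java.io.File reports neither a file nor a directory, return the path's own status instead of null.
{code}
// Simplified sketch of the described approach -- may differ from the patch.
public FileStatus[] listStatus(Path f) throws IOException {
  File localf = pathToFile(f);
  if (!localf.exists()) {
    throw new FileNotFoundException("File " + f + " does not exist");
  }
  if (localf.isFile()) {
    return new FileStatus[] { getFileStatus(f) };
  }
  String[] names = localf.list();
  if (names == null) {
    // Non-regular file (e.g. a UNIX pipe): isFile()/isDirectory() are
    // both false and list() returns null, so fall back to the path's
    // own status instead of returning null.
    return new FileStatus[] { getFileStatus(f) };
  }
  FileStatus[] results = new FileStatus[names.length];
  for (int i = 0; i < names.length; i++) {
    results[i] = getFileStatus(new Path(f, names[i]));
  }
  return results;
}
{code}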
[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626168#comment-14626168 ] Hadoop QA commented on HDFS-8578: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 1s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 37s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 158m 41s | Tests failed in hadoop-hdfs. | | | | 201m 29s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestReplication | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745204/HDFS-8578-06.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / a431ed9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11695/console | This message was automatically generated. On upgrade, Datanode should process all storage/data dirs in parallel - Key: HDFS-8578 URL: https://issues.apache.org/jira/browse/HDFS-8578 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Raju Bairishetti Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-branch-2.6.0.patch Right now, during upgrades datanode is processing all the storage dirs sequentially. Assume it takes ~20 mins to process a single storage dir then datanode which has ~10 disks will take around 3hours to come up. *BlockPoolSliceStorage.java* {code} for (int idx = 0; idx getNumStorageDirs(); idx++) { doTransition(datanode, getStorageDir(idx), nsInfo, startOpt); assert getCTime() == nsInfo.getCTime() : Data-node and name-node CTimes must be the same.; } {code} It would save lots of time during major upgrades if datanode process all storagedirs/disks parallelly. Can we make datanode to process all storage dirs parallelly? 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8767: -- Attachment: HDFS-8767-00.patch RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file's status. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
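For reference, a minimal standalone reproduction of the HDFS-8767 symptom (hypothetical test program, assuming a POSIX system with mkfifo on the PATH):
{code}
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;

public class PipeListStatusRepro {
  public static void main(String[] args) throws Exception {
    File pipe = new File("/tmp/test.pipe");
    // Create a named pipe outside Java; the JDK has no portable mkfifo.
    Runtime.getRuntime().exec(new String[] {"mkfifo", pipe.getAbsolutePath()}).waitFor();
    RawLocalFileSystem fs = new RawLocalFileSystem();
    fs.initialize(fs.getUri(), new Configuration());
    // Before the fix this prints "null" for a pipe instead of "1 entries".
    FileStatus[] statuses = fs.listStatus(new Path(pipe.getAbsolutePath()));
    System.out.println(statuses == null ? "null" : statuses.length + " entries");
    pipe.delete();
  }
}
{code}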
[jira] [Commented] (HDFS-8702) Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block
[ https://issues.apache.org/jira/browse/HDFS-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626142#comment-14626142 ] Hadoop QA commented on HDFS-8702: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 3s | Pre-patch HDFS-7285 has 5 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 18s | The applied patch generated 1 new checkstyle issues (total was 204, now 202). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 17s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 173m 45s | Tests failed in hadoop-hdfs. | | | | 220m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745194/HDFS-8702-HDFS-7285.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / b1e6429 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/artifact/patchprocess/HDFS-7285FindbugsWarningshadoop-hdfs.html | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11693/console | This message was automatically generated. Erasure coding: update BlockManager.blockHasEnoughRacks(..) logic for striped block --- Key: HDFS-8702 URL: https://issues.apache.org/jira/browse/HDFS-8702 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Kai Sasaki Attachments: HDFS-8702-HDFS-7285.00.patch, HDFS-8702-HDFS-7285.01.patch, HDFS-8702-HDFS-7285.02.patch, HDFS-8702-HDFS-7285.03.patch, HDFS-8702-HDFS-7285.04.patch Currently blockHasEnoughRacks(..) only guarantees 2 racks. The logic needs to be updated for striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
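To make the intuition concrete, one plausible shape of a striped-block rack check is sketched below; the actual rule is whatever the attached patches implement, and the single-rack-failure criterion here is an assumption for illustration. The idea: for a striped group, losing any single rack must still leave at least the data-block count of internal blocks alive, otherwise the group cannot be reconstructed.
{code}
// Hypothetical sketch, not the HDFS-8702 patch.
static boolean hasEnoughRacksForStriped(DatanodeInfo[] locs, int dataBlkNum) {
  java.util.Map<String, Integer> perRack = new java.util.HashMap<String, Integer>();
  int maxOnOneRack = 0;
  for (DatanodeInfo dn : locs) {
    String rack = dn.getNetworkLocation();
    Integer c = perRack.get(rack);
    int n = (c == null ? 0 : c.intValue()) + 1;
    perRack.put(rack, n);
    maxOnOneRack = Math.max(maxOnOneRack, n);
  }
  // Losing the most-loaded rack must still leave >= dataBlkNum internal blocks,
  // otherwise a single rack failure makes the block group unrecoverable.
  return locs.length - maxOnOneRack >= dataBlkNum;
}
{code}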
[jira] [Updated] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8772: Attachment: HDFS-8772.01.patch fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8772.01.patch https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/
{noformat}
java.lang.AssertionError: expected:<0> but was:<4>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.junit.Assert.assertEquals(Assert.java:542)
  at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8772: Status: Patch Available (was: Open) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8772.01.patch https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/
{noformat}
java.lang.AssertionError: expected:<0> but was:<4>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.junit.Assert.assertEquals(Assert.java:542)
  at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8769) unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8769: Assignee: Rakesh R unit test for SequentialBlockGroupIdGenerator - Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Rakesh R -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626117#comment-14626117 ] Walter Su commented on HDFS-8772: - *Cause* cluster.waitActive() doesn't wait for the first block report (BR) to finish. *How to reproduce quickly*
{code}
@@ -140,6 +151,7 @@ public void testDatanodeRestarts() throws Exception {
     HAUtil.setAllowStandbyReads(conf, true);
     conf.setLong(DFSConfigKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY, 0);
     conf.setInt(DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_KEY, 1);
+    conf.setLong(DFS_BLOCKREPORT_INITIAL_DELAY_KEY, 200);
     MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
{code}
fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/
{noformat}
java.lang.AssertionError: expected:<0> but was:<4>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.junit.Assert.assertEquals(Assert.java:542)
  at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
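Given that diagnosis, one way such a fix can look (a sketch only, not necessarily the attached HDFS-8772.01.patch) is to poll until the restarted datanode's block report has been processed before asserting, e.g. with the real test helper {{GenericTestUtils.waitFor}}; the helper method and expected count below are assumptions:
{code}
// Hypothetical sketch: wait for the DN's first block report instead of
// asserting immediately after cluster.waitActive().
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    try {
      // getLocatedBlockCount/expectedLocs stand in for whatever the
      // assertion at TestStandbyIsHot.java:188 actually checks.
      return getLocatedBlockCount(nn1, path) == expectedLocs;
    } catch (Exception e) {
      return false;
    }
  }
}, 100, 30000); // check every 100 ms, give up after 30 s
{code}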
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626209#comment-14626209 ] Hadoop QA commented on HDFS-8767: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 11s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 7s | The applied patch generated 1 new checkstyle issues (total was 21, now 21). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 53s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 4s | Tests failed in hadoop-common. | | | | 62m 44s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.crypto.key.TestValueQueue | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745231/HDFS-8767-00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/diffcheckstylehadoop-common.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11698/console | This message was automatically generated. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file's status. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page
Rakesh R created HDFS-8773: -- Summary: Few FSNamesystem metrics are not documented in the Metrics page Key: HDFS-8773 URL: https://issues.apache.org/jira/browse/HDFS-8773 Project: Hadoop HDFS Issue Type: Bug Components: documentation Reporter: Rakesh R Assignee: Rakesh R This jira is to document missing metrics in the [Metrics page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem]. The following are not documented:
{code}
MissingReplOneBlocks
NumFilesUnderConstruction
NumActiveClients
HAState
FSState
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626239#comment-14626239 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/986/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
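A condensed sketch of those two suggestions (variable and method names here are assumptions; the real change is in the Mover.java/Dispatcher.java files of the commit above; ExitStatus.NO_MOVE_PROGRESS is a real value in org.apache.hadoop.hdfs.server.balancer.ExitStatus):
{code}
// Hypothetical sketch: reset the retry counter whenever any move succeeded,
// and exit with a dedicated code when no progress is made at all.
if (bytesMoved > 0) {
  retryCount.set(0); // some moves succeeded this iteration: progress, count afresh
} else if (retryCount.incrementAndGet() >= retryMaxAttempts) {
  // Nothing moved for retryMaxAttempts consecutive iterations: give up with
  // NO_MOVE_PROGRESS rather than a generic IO_EXCEPTION.
  return ExitStatus.NO_MOVE_PROGRESS;
}
{code}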
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626241#comment-14626241 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/986/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
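To make the failure mode concrete: because {{waitForMoveCompletion}} keeps returning true whenever any migration fails, the Mover's outer loop never terminates. A retry-bounded loop in the spirit of the HDFS-8143 fix (a sketch with hypothetical names, not the committed code) looks like:
{code}
// Hypothetical sketch: bound the Mover loop with a retry counter.
final int retryMaxAttempts = 10; // assumed limit, configurable in practice
int retryCount = 0;
boolean hasRemaining = true;
while (hasRemaining) {
  hasRemaining = processNamespace(); // schedules moves, waits for completion
  if (hasRemaining && ++retryCount >= retryMaxAttempts) {
    throw new IOException("Failed to move some blocks after "
        + retryMaxAttempts + " retries.");
  }
}
{code}
HDFS-8541 then refines this by resetting the counter on partial progress and replacing the exception with the NO_MOVE_PROGRESS exit status, as sketched above.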
[jira] [Commented] (HDFS-8772) fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-8772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626312#comment-14626312 ] Hadoop QA commented on HDFS-8772: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 5m 40s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 7s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 28s | Tests failed in hadoop-hdfs. | | | | 180m 16s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745227/HDFS-8772.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / ac94ba3 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11697/console | This message was automatically generated. 
fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails Key: HDFS-8772 URL: https://issues.apache.org/jira/browse/HDFS-8772 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8772.01.patch https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11598/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11600/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11606/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11608/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11612/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11618/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11650/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11655/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11659/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11663/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11664/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11667/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11669/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11676/testReport/ https://builds.apache.org/job/PreCommit-HDFS-Build/11677/testReport/
{noformat}
java.lang.AssertionError: expected:<0> but was:<4>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.junit.Assert.assertEquals(Assert.java:542)
  at org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot.testDatanodeRestarts(TestStandbyIsHot.java:188)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8475) Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available
[ https://issues.apache.org/jira/browse/HDFS-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626288#comment-14626288 ] Vinod Valecha commented on HDFS-8475: - Hi Team, could this be a configuration issue in Hadoop? Can you please point us to the configuration that we should be looking at in order to correct this? Thanks a lot! Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available Key: HDFS-8475 URL: https://issues.apache.org/jira/browse/HDFS-8475 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Vinod Valecha Priority: Blocker Scenario: write a file, then corrupt a block manually. Exception stack trace:
2015-05-24 02:31:55.291 INFO [T-33716795] [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available
  at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
[5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer createBlockOutputStream Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available
  at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1155)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
2015-05-24 02:31:55.291 INFO [T-33716795] [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Abandoning BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579
[5/24/15 2:31:55:291 UTC] 02027a3b DFSClient I org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream Abandoning BP-176676314-10.108.106.59-1402620296713:blk_1404621403_330880579
2015-05-24 02:31:55.299 INFO [T-33716795] [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] Excluding datanode 10.108.106.59:50010
[5/24/15 2:31:55:299 UTC] 02027a3b DFSClient I org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream Excluding datanode 10.108.106.59:50010
2015-05-24 02:31:55.300 WARNING [T-33716795] [org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer] DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
[5/24/15 2:31:55:300 UTC] 02027a3b DFSClient W org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /var/db/opera/files/B4889CCDA75F9751DDBB488E5AAB433E/BE4DAEF290B7136ED6EF3D4B157441A2/BE4DAEF290B7136ED6EF3D4B157441A2-4.pag could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
2015-05-24 02:31:55.301 WARNING [T-880]
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626231#comment-14626231 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626228#comment-14626228 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8578: Attachment: HDFS-8578-07.patch Fixed {{TestReplication}}, which fails intermittently; it is not exactly related to this Jira. {{TestHDFSCLI}} failed due to a collision with the hadoop-common precommit job. {{TestDataNodeRollingUpgrade}} passes locally. On upgrade, Datanode should process all storage/data dirs in parallel - Key: HDFS-8578 URL: https://issues.apache.org/jira/browse/HDFS-8578 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Raju Bairishetti Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch Right now, during upgrades the datanode processes all the storage dirs sequentially. Assume it takes ~20 minutes to process a single storage dir; a datanode with ~10 disks will then take around 3 hours to come up. *BlockPoolSliceStorage.java*
{code}
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
  assert getCTime() == nsInfo.getCTime()
      : "Data-node and name-node CTimes must be the same.";
}
{code}
It would save a lot of time during major upgrades if the datanode processed all storage dirs/disks in parallel. Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8767: -- Attachment: HDFS-8767-01.patch RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file's status. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Attachment: HDFS-8736-1.patch ability to deny access to HDFS filesystems -- Key: HDFS-8736 URL: https://issues.apache.org/jira/browse/HDFS-8736 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.5.0 Reporter: Purvesh Patel Priority: Minor Labels: security Attachments: HDFS-8736-1.patch In order to run in a secure context, and to provide the ability to deny non-trusted code access to different filesystems (specifically the local file system), this patch adds a new SecurityPermission class (AccessFileSystemPermission) and checks the permission in FileSystem#get before returning a cached file system or creating a new one. Please see the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
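A minimal sketch of the mechanism the description outlines, assuming a SecurityManager is installed (the class and permission-name details are illustrative, not necessarily the exact attached patch):
{code}
import java.security.BasicPermission;

// Hypothetical permission type guarding FileSystem access per URI scheme.
public class AccessFileSystemPermission extends BasicPermission {
  public AccessFileSystemPermission(String scheme) {
    super(scheme); // e.g. "hdfs" or "file"; BasicPermission gives "*" wildcard support
  }
}

// Sketch of the check inside FileSystem#get, before returning a cached
// instance or creating a new one:
//   SecurityManager sm = System.getSecurityManager();
//   if (sm != null) {
//     sm.checkPermission(new AccessFileSystemPermission(uri.getScheme()));
//   }
{code}
With this shape, untrusted code running without the permission in its policy would get a SecurityException from FileSystem#get instead of a usable FileSystem handle.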
[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel
[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626472#comment-14626472 ] Hadoop QA commented on HDFS-8578: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 57s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 1 new checkstyle issues (total was 597, now 592). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 1s | Tests failed in hadoop-hdfs. | | | | 204m 22s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745242/HDFS-8578-07.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11699/console | This message was automatically generated. On upgrade, Datanode should process all storage/data dirs in parallel - Key: HDFS-8578 URL: https://issues.apache.org/jira/browse/HDFS-8578 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Raju Bairishetti Assignee: Vinayakumar B Priority: Critical Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-branch-2.6.0.patch Right now, during upgrades the datanode processes all the storage dirs sequentially. Assume it takes ~20 minutes to process a single storage dir; a datanode with ~10 disks will then take around 3 hours to come up.
*BlockPoolSliceStorage.java*
{code}
for (int idx = 0; idx < getNumStorageDirs(); idx++) {
  doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
  assert getCTime() == nsInfo.getCTime()
      : "Data-node and name-node CTimes must be the same.";
}
{code}
It would save a lot of time during major upgrades if the datanode processed all storage dirs/disks in parallel. Can we make the datanode process all storage dirs in parallel? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626381#comment-14626381 ] kanaka kumar avvaru commented on HDFS-8622: --- Thanks for the update [~jagadesh.kiran], -02.patch looks good to me. +1 (non-binding). [~ajisakaa], please share your view. Implement GETCONTENTSUMMARY operation for WebImageViewer --- Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch It would be better for administrators if {code}GETCONTENTSUMMARY{code} were supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
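For context, the WebImageViewer serves a read-only, WebHDFS-style REST endpoint over an fsimage, so once a patch like this lands the new operation would presumably be queried the same way as the existing ones. A hypothetical client (host, port 5978, and path are examples; the op only exists with the patch applied):
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class GetContentSummaryExample {
  public static void main(String[] args) throws Exception {
    // Assumes a WebImageViewer already listening on its default address
    // (started via the hdfs oiv tool against an fsimage file).
    URL url = new URL("http://localhost:5978/webhdfs/v1/user?op=GETCONTENTSUMMARY");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line); // JSON ContentSummary: directoryCount, fileCount, length...
    }
    in.close();
  }
}
{code}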
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626387#comment-14626387 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626385#comment-14626385 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626397#comment-14626397 ] kanaka kumar avvaru commented on HDFS-8767: --- Updated the patch with a test case for UNIX-based systems, as per [~ste...@apache.org]'s comment. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file's status. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Attachment: (was: Patch.pdf) ability to deny access to HDFS filesystems -- Key: HDFS-8736 URL: https://issues.apache.org/jira/browse/HDFS-8736 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.5.0 Reporter: Purvesh Patel Priority: Minor Labels: security Attachments: HDFS-8736-1.patch In order to run in a secure context, and to provide the ability to deny non-trusted code access to different filesystems (specifically the local file system), this patch adds a new SecurityPermission class (AccessFileSystemPermission) and checks the permission in FileSystem#get before returning a cached file system or creating a new one. Please see the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Purvesh Patel updated HDFS-8736: Summary: ability to deny access to HDFS filesystems (was: ability to deny access to different filesystems) ability to deny access to HDFS filesystems -- Key: HDFS-8736 URL: https://issues.apache.org/jira/browse/HDFS-8736 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.5.0 Reporter: Purvesh Patel Priority: Minor Labels: security Attachments: HDFS-8736-1.patch In order to run in a secure context, and to provide the ability to deny non-trusted code access to different filesystems (specifically the local file system), this patch adds a new SecurityPermission class (AccessFileSystemPermission) and checks the permission in FileSystem#get before returning a cached file system or creating a new one. Please see the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626450#comment-14626450 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626452#comment-14626452 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626463#comment-14626463 ] Purvesh Patel commented on HDFS-8736: - There is a little confusion in the description of the issue. This patch is intended to prevent untrusted user code from accessing HDFS, not the local file system. It's written in such a way as to potentially enable it to block access to any type of FileSystem, with the caveat that you'd also need to guard against users trying to instantiate the file system implementation directly using other permissions. The additional permission prevents users from getting access to instances of the HDFS FileSystem that were created when the user code was off-stack and that have pre-cached network connections. ability to deny access to HDFS filesystems -- Key: HDFS-8736 URL: https://issues.apache.org/jira/browse/HDFS-8736 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.5.0 Reporter: Purvesh Patel Priority: Minor Labels: security Attachments: HDFS-8736-1.patch In order to run in a secure context, and to provide the ability to deny non-trusted code access to different filesystems (specifically the local file system), this patch adds a new SecurityPermission class (AccessFileSystemPermission) and checks the permission in FileSystem#get before returning a cached file system or creating a new one. Please see the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626500#comment-14626500 ] Hadoop QA commented on HDFS-8767: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 26s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 7s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 4s | Tests passed in hadoop-common. | | | | 61m 3s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745256/HDFS-8767-01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4084eaf | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11700/console | This message was automatically generated. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file's status. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626492#comment-14626492 ] Hudson commented on HDFS-8143: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626490#comment-14626490 ] Hudson commented on HDFS-8541: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8143) HDFS Mover tool should exit after some retry when failed to move blocks.
[ https://issues.apache.org/jira/browse/HDFS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626403#comment-14626403 ] Hudson commented on HDFS-8143: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/]) Add HDFS-8143 to CHANGES.txt. (szetszwo: rev f7c8311e9836ad1a1a2ef6eca8b42fd61a688164) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS Mover tool should exit after some retry when failed to move blocks. Key: HDFS-8143 URL: https://issues.apache.org/jira/browse/HDFS-8143 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8143.patch, HDFS-8143_1.patch, HDFS-8143_2.patch, HDFS-8143_3.patch The Mover does not exit when it fails to move blocks.
{code}
hasRemaining |= Dispatcher.waitForMoveCompletion(storages.targets.values());
{code}
{{Dispatcher.waitForMoveCompletion()}} will always return true if some block migrations failed, so hasRemaining never becomes false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8541) Mover should exit with NO_MOVE_PROGRESS if there is no move progress
[ https://issues.apache.org/jira/browse/HDFS-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626401#comment-14626401 ] Hudson commented on HDFS-8541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/]) HDFS-8541. Mover should exit with NO_MOVE_PROGRESS if there is no move progress. Contributed by Surendra Singh Lilhore (szetszwo: rev 9ef03a4c5bb5573eadc7d04e371c4af2dc6bae37) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java Mover should exit with NO_MOVE_PROGRESS if there is no move progress Key: HDFS-8541 URL: https://issues.apache.org/jira/browse/HDFS-8541 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer & mover Reporter: Tsz Wo Nicholas Sze Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8541.patch, HDFS-8541_1.patch, HDFS-8541_2.patch HDFS-8143 changed the Mover to exit after some retries when it fails to move blocks. Two additional suggestions:
# The Mover retry counter should be incremented only if all moves fail. If there are some successful moves, the counter should be reset.
# The Mover should exit with NO_MOVE_PROGRESS instead of IO_EXCEPTION in case of failure.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile
[ https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626625#comment-14626625 ] Haohui Mai commented on HDFS-8767: -- It looks like a cleaner approach is to call {{list()}} only when the file is a directory. RawLocalFileSystem.listStatus() returns null for UNIX pipefile -- Key: HDFS-8767 URL: https://issues.apache.org/jira/browse/HDFS-8767 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: kanaka kumar avvaru Priority: Critical Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of the file. The bug breaks Hive when Hive loads data from a UNIX pipe file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
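A sketch of the guarded approach suggested above. It relies only on the documented java.io.File behavior that {{list()}} returns null for anything that is not a directory (including UNIX pipe files); the class and method names are hypothetical, not RawLocalFileSystem's actual code.
{code}
import java.io.File;
import java.io.FileNotFoundException;

// Hypothetical sketch: only call list() after an explicit directory check.
class ListStatusSketch {
  static String[] listPaths(File f) throws FileNotFoundException {
    if (!f.exists()) {
      throw new FileNotFoundException(f + " does not exist");
    }
    if (!f.isDirectory()) {
      // a pipe, socket or regular file: report the path itself
      return new String[] { f.getPath() };
    }
    String[] names = f.list(); // safe: f is known to be a directory
    if (names == null) {
      throw new FileNotFoundException("cannot list " + f);
    }
    return names;
  }
}
{code}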
[jira] [Commented] (HDFS-8736) ability to deny access to HDFS filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626633#comment-14626633 ] Allen Wittenauer commented on HDFS-8736: Trying to solve server security problems from the client side never works. bq. with the caveat that you'd need to also guard against users trying to instantiate the file system implementation directly using other permissions. ... which is nearly impossible. It takes almost no work to do exactly that: java -Dfs.hdfs.impl=myclass or java -Dfs.s3.impl=DistributedFileSystem or whatever. Now what? ability to deny access to HDFS filesystems -- Key: HDFS-8736 URL: https://issues.apache.org/jira/browse/HDFS-8736 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.5.0 Reporter: Purvesh Patel Priority: Minor Labels: security Attachments: HDFS-8736-1.patch In order to run in a secure context, we need the ability to deny non-trusted code access to different filesystems (specifically the local file system). This patch adds a new SecurityPermission class (AccessFileSystemPermission) and checks the permission in FileSystem#get before returning a cached file system or creating a new one. Please see the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
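To illustrate how little work the bypass takes: {{FileSystem.get()}} resolves the implementation class from the caller-controlled fs.<scheme>.impl configuration key, so a client-side permission check can be sidestepped by pointing a scheme at a different class. RawLocalFileSystem serves here as a harmless stand-in for the arbitrary myclass in the comment above.
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch: the caller, not the server, decides which class backs a scheme.
public class ImplOverrideSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the "file" scheme at a class of the caller's choosing:
    conf.set("fs.file.impl", "org.apache.hadoop.fs.RawLocalFileSystem");
    FileSystem fs = FileSystem.get(URI.create("file:///"), conf);
    System.out.println(fs.getClass()); // RawLocalFileSystem, not the default
  }
}
{code}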
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626642#comment-14626642 ] Allen Wittenauer commented on HDFS-8344: +1 NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch I found another(?) instance in which the lease is not recovered. This is easily reproducible on a pseudo-distributed single node cluster. # Before you start, it helps to lower these limits. This is not necessary, but it reduces how long you have to wait: {code}
public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
{code} # Client starts to write a file. (It could be less than 1 block, but it has hflushed, so some of the data has landed on the datanodes.) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9 of the $(hadoop jar TestHadoop.jar) process after it has printed Wrote to the bufferedWriter.) # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.) I believe the lease should be recovered and the block should be marked missing. However this is not happening. The lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode) (even months afterwards). There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
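As a stop-gap for operators hitting this today, lease recovery can be triggered by hand through the existing {{DistributedFileSystem#recoverLease}} API. This is a client-side workaround sketch, not the NameNode-side fix this issue calls for; the file path is hypothetical.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Sketch: manually ask the NameNode to recover the lease of a file whose
// writer has crashed. recoverLease returns true once the file is closed.
public class RecoverLeaseSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path file = new Path("/tmp/TestHadoop.out"); // hypothetical path
    try (FileSystem fs = FileSystem.get(file.toUri(), conf)) {
      // assumes the default FS is HDFS; otherwise the cast fails
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      System.out.println("file closed: " + dfs.recoverLease(file));
    }
  }
}
{code}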
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626645#comment-14626645 ] Zhe Zhang commented on HDFS-8058: - Triggering Jenkins again. Last run generated a lot of Class not found errors. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626596#comment-14626596 ] Walter Su commented on HDFS-8433: - I have another idea: pick up the {{BlockIdRange}} idea from the 01 patch again. This time we don't need a {{BlockIdRange}} class; we extend the fields of {{BlockTokenIdentifier}} (see BlockTokenIdentifier#readFields(..) / write(..)) and just add a field {{IdRange}} with a default value of 0. I think the performance impact on contiguous blocks is small, and I think it also supports old DNs: an old DN just doesn't read the last field. I prefer to implement this against trunk, merge the 02 patch (multiple-tokens method) into the feature branch, and see how it works. Then we can decide whether it's worth picking up BlockIdRange. Hi, [~jingzhao], [~szetszwo]! Any ideas? blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil -- Key: HDFS-8433 URL: https://issues.apache.org/jira/browse/HDFS-8433 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Walter Su Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, HDFS-8433.01.patch The blockToken provided in LocatedStripedBlock is not used to create LocatedBlock in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil. We should also add ec tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626810#comment-14626810 ] Jing Zhao commented on HDFS-8058: - The 007 patch looks good to me. Just two minor points: # In {{INodeFileAttributes}}, it's better to keep the {{isStriped}} attribute of the snapshot copy the same as the INodeFile's. Thus maybe we can add a boolean parameter to {{INodeFileAttributes.SnapshotCopy}}'s constructor, and in {{FSImageFormatPBSnapshot#loadFileDiffList}} we pass in {{file.isStriped}}. {code}
-    header = HeaderFormat.toLong(preferredBlockSize, replication,
+    header = HeaderFormat.toLong(preferredBlockSize, replication, false,
         storagePolicyID);
{code} # Looks like we do not need to add a new constructor for INodeFile. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8734) Erasure Coding: fix one cell need two packets
[ https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8734: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) I've committed this to the feature branch. Thanks for the contribution, Walter! Thanks for the review, Bo! Erasure Coding: fix one cell need two packets - Key: HDFS-8734 URL: https://issues.apache.org/jira/browse/HDFS-8734 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Fix For: HDFS-7285 Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch The default WritePacketSize is 64k and the current default cellSize is also 64k, so we expect one cell to consume exactly one packet. In fact it does not. By default, chunkSize = 516 (512 data + 4 checksum), packetSize = 64k, chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for details), numBytes of data in one packet = 64512, and cellSize = 65536. When the first packet is full (with 64512 bytes of data), there are still 65536 - 64512 = 1024 bytes left. {code}
super.writeChunk(bytes, offset, len, checksum, ckoff, cklen);
// cell is full and current packet has not been enqueued
if (cellFull && currentPacket != null) {
  enqueueCurrentPacketFull();
}
{code} When the last 1024 bytes of the cell are written, we hit {{cellFull}} and create another packet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
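The arithmetic in the description, worked through once. The packet header length of 33 bytes is an assumption approximating PacketHeader.PKT_MAX_HEADER_LEN; everything else follows from the stated defaults.
{code}
// Why a 64K cell does not fit into one 64K packet.
public class PacketCellMath {
  public static void main(String[] args) {
    final int chunkSize = 512 + 4;        // 512 data + 4 checksum = 516
    final int packetSize = 64 * 1024;     // 65536
    final int cellSize = 64 * 1024;       // 65536
    final int maxHeaderLen = 33;          // assumed PKT_MAX_HEADER_LEN
    final int chunksPerPacket = (packetSize - maxHeaderLen) / chunkSize; // 126
    final int dataPerPacket = chunksPerPacket * 512;                     // 64512
    System.out.println("chunksPerPacket = " + chunksPerPacket);
    System.out.println("data per packet = " + dataPerPacket);
    System.out.println("cell bytes spilling into a 2nd packet = "
        + (cellSize - dataPerPacket));                                   // 1024
  }
}
{code}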
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627017#comment-14627017 ] Hadoop QA commented on HDFS-8058: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 13s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 37s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 27s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 12s | Tests failed in hadoop-hdfs. | | | | 230m 36s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745196/HDFS-8058-HDFS-7285.007.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / b1e6429 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11701/console | This message was automatically generated. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. 
Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627112#comment-14627112 ] Jing Zhao commented on HDFS-8058: - Created HDFS-8777 to add more tests for (snapshot + EC). Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8734) Erasure Coding: fix one cell need two packets
[ https://issues.apache.org/jira/browse/HDFS-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626973#comment-14626973 ] Jing Zhao commented on HDFS-8734: - The patch looks good to me. +1. One concern is that we now have more and more variables to track the states of the streamers. We can revisit them later and maybe do some code cleanup. Erasure Coding: fix one cell need two packets - Key: HDFS-8734 URL: https://issues.apache.org/jira/browse/HDFS-8734 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8734-HDFS-7285.01.patch, HDFS-8734.01.patch The default WritePacketSize is 64k and the current default cellSize is also 64k, so we expect one cell to consume exactly one packet. In fact it does not. By default, chunkSize = 516 (512 data + 4 checksum), packetSize = 64k, chunksPerPacket = 126 (see DFSOutputStream#computePacketChunkSize for details), numBytes of data in one packet = 64512, and cellSize = 65536. When the first packet is full (with 64512 bytes of data), there are still 65536 - 64512 = 1024 bytes left. {code}
super.writeChunk(bytes, offset, len, checksum, ckoff, cklen);
// cell is full and current packet has not been enqueued
if (cellFull && currentPacket != null) {
  enqueueCurrentPacketFull();
}
{code} When the last 1024 bytes of the cell are written, we hit {{cellFull}} and create another packet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8776) Decom manager should not be active on standby
Daryn Sharp created HDFS-8776: - Summary: Decom manager should not be active on standby Key: HDFS-8776 URL: https://issues.apache.org/jira/browse/HDFS-8776 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The decommission manager should not be actively processing on the standby. The decomm manager goes through the costly computation of determining that every block on the node requires replication, yet doesn't queue anything for replication because it's in standby. While the decomm manager holds the namesystem write lock, DNs time out on heartbeats or IBRs; the NN purges the call queue of timed-out clients and processes some heartbeats/IBRs before the decomm manager locks up the namesystem again. Nodes attempting to register will be sending full BRs, which are more costly to send and discard than a heartbeat. If a failover is required, the standby will likely struggle very hard to avoid GCing while catching up on its queued IBRs as DNs continue to fill the call queue and time out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8775) SASL support for data transfer protocol in libhdfspp
Haohui Mai created HDFS-8775: Summary: SASL support for data transfer protocol in libhdfspp Key: HDFS-8775 URL: https://issues.apache.org/jira/browse/HDFS-8775 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai This jira proposes to implement basic SASL support for the data transfer protocol which allows libhdfspp to talk to secure clusters. Support for encryption is deferred to subsequent jiras. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8716) introduce a new config specifically for safe mode block count
[ https://issues.apache.org/jira/browse/HDFS-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated HDFS-8716: --- Attachment: HDFS-8716.7.patch Added a unit test. introduce a new config specifically for safe mode block count - Key: HDFS-8716 URL: https://issues.apache.org/jira/browse/HDFS-8716 Project: Hadoop HDFS Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: HDFS-8716.1.patch, HDFS-8716.2.patch, HDFS-8716.3.patch, HDFS-8716.4.patch, HDFS-8716.5.patch, HDFS-8716.6.patch, HDFS-8716.7.patch During startup, the namenode waits for n replicas of each block to be reported by datanodes before exiting safe mode. Currently n is tied to the min-replication config. We could set min replication to more than one, but we might still want to exit safe mode as soon as each block has one reported replica. This can be addressed by introducing a new config variable specifically for the safe-mode block count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
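A sketch of the decoupling proposed here: read a dedicated safe-mode replica count and fall back to the existing min-replication value when it is unset. The key name dfs.namenode.safemode.replication.min is a hypothetical placeholder, not necessarily what the final patch uses; dfs.namenode.replication.min is the existing key.
{code}
import org.apache.hadoop.conf.Configuration;

// Sketch: a safe-mode replica count independent of min replication.
public class SafeModeConfigSketch {
  static int safeModeReplication(Configuration conf) {
    int minReplication = conf.getInt("dfs.namenode.replication.min", 1);
    // Hypothetical new key; defaults to the old behavior when unset.
    return conf.getInt("dfs.namenode.safemode.replication.min", minReplication);
  }
}
{code}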
[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626888#comment-14626888 ] Kihwal Lee commented on HDFS-8722: -- Thanks for the review, Arpit. I've committed this to trunk, branch-2 and branch-2.7. Optimize datanode writes for small writes and flushes - Key: HDFS-8722 URL: https://issues.apache.org/jira/browse/HDFS-8722 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.1 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.2 Attachments: HDFS-8722.patch, HDFS-8722.v1.patch After the data corruption fix by HDFS-4660, the CRC recalculation for partial chunk is executed more frequently, if the client repeats writing few bytes and calling hflush/hsync. This is because the generic logic forces CRC recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, datanode blindly accepted whatever CRC client provided, if the incoming data is chunk-aligned. This was the source of the corruption. We can still optimize for the most common case where a client is repeatedly writing small number of bytes followed by hflush/hsync with no pipeline recovery or append, by allowing the previous behavior for this specific case. If the incoming data has a duplicate portion and that is at the last chunk-boundary before the partial chunk on disk, datanode can use the checksum supplied by the client without redoing the checksum on its own. This reduces disk reads as well as CPU load for the checksum calculation. If the incoming packet data goes back further than the last on-disk chunk boundary, datanode will still do a recalculation, but this occurs rarely during pipeline recoveries. Thus the optimization for this specific case should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
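A simplified sketch of the boundary condition described above: the datanode may reuse the client-supplied checksum only when the duplicated portion of the incoming packet begins exactly at the last chunk boundary before the on-disk partial chunk; data reaching further back forces a recalculation. The real BlockReceiver logic is considerably more involved, so treat the predicate and names below as illustrative assumptions.
{code}
// Sketch: when can the DN trust the client checksum instead of recomputing?
public class ChecksumReuseSketch {
  static final int CHUNK = 512; // bytes of data per checksum chunk

  static boolean canReuseClientChecksum(long onDiskBytes, long packetStart) {
    long lastChunkBoundary = (onDiskBytes / CHUNK) * CHUNK;
    // Reuse is safe only when the overlap starts at that boundary.
    return packetStart == lastChunkBoundary;
  }

  public static void main(String[] args) {
    // 1000 bytes on disk: the last chunk boundary is at offset 512.
    System.out.println(canReuseClientChecksum(1000, 512)); // true: reuse
    System.out.println(canReuseClientChecksum(1000, 0));   // false: recalc
  }
}
{code}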
[jira] [Updated] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8722: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.2 Status: Resolved (was: Patch Available) Optimize datanode writes for small writes and flushes - Key: HDFS-8722 URL: https://issues.apache.org/jira/browse/HDFS-8722 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.1 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.2 Attachments: HDFS-8722.patch, HDFS-8722.v1.patch After the data corruption fix by HDFS-4660, the CRC recalculation for partial chunk is executed more frequently, if the client repeats writing few bytes and calling hflush/hsync. This is because the generic logic forces CRC recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, datanode blindly accepted whatever CRC client provided, if the incoming data is chunk-aligned. This was the source of the corruption. We can still optimize for the most common case where a client is repeatedly writing small number of bytes followed by hflush/hsync with no pipeline recovery or append, by allowing the previous behavior for this specific case. If the incoming data has a duplicate portion and that is at the last chunk-boundary before the partial chunk on disk, datanode can use the checksum supplied by the client without redoing the checksum on its own. This reduces disk reads as well as CPU load for the checksum calculation. If the incoming packet data goes back further than the last on-disk chunk boundary, datanode will still do a recalculation, but this occurs rarely during pipeline recoveries. Thus the optimization for this specific case should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8764) Generate Hadoop RPC stubs from protobuf definitions
[ https://issues.apache.org/jira/browse/HDFS-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8764: - Attachment: HDFS-8764.000.patch Generate Hadoop RPC stubs from protobuf definitions --- Key: HDFS-8764 URL: https://issues.apache.org/jira/browse/HDFS-8764 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8764.000.patch It would be nice to have the RPC stubs generated from the protobuf definitions, similar to what HADOOP-10388 has achieved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626956#comment-14626956 ] Jing Zhao commented on HDFS-8058: - Here {{fileInPb.getIsStriped()}} should be {{file.isStriped}} since we have not persisted the isStriped information into FileDiff. Or we can move setIsStriped(n.isStriped()) into {{buildINodeFile}}. Maybe the latter way is cleaner. Let's also create a separate jira to add more tests on the (EC + snapshot) scenario. {code}
-        (byte)fileInPb.getStoragePolicyID(), xAttrs);
+        (byte)fileInPb.getStoragePolicyID(), xAttrs, fileInPb.getIsStriped());
{code} Other than this, +1 if Jenkins runs fine. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
Jing Zhao created HDFS-8777: --- Summary: Erasure Coding: add tests for taking snapshots on EC files Key: HDFS-8777 URL: https://issues.apache.org/jira/browse/HDFS-8777 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao We need to add more tests for (EC + snapshots). The tests need to verify the fsimage saving/loading is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes
[ https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626895#comment-14626895 ] Hudson commented on HDFS-8722: -- FAILURE: Integrated in Hadoop-trunk-Commit #8163 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8163/]) HDFS-8722. Optimize datanode writes for small writes and flushes. Contributed by Kihwal Lee (kihwal: rev 59388a801514d6af64ef27fbf246d8054f1dcc74) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Optimize datanode writes for small writes and flushes - Key: HDFS-8722 URL: https://issues.apache.org/jira/browse/HDFS-8722 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.1 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.2 Attachments: HDFS-8722.patch, HDFS-8722.v1.patch After the data corruption fix by HDFS-4660, the CRC recalculation for partial chunk is executed more frequently, if the client repeats writing few bytes and calling hflush/hsync. This is because the generic logic forces CRC recalculation if on-disk data is not CRC chunk aligned. Prior to HDFS-4660, datanode blindly accepted whatever CRC client provided, if the incoming data is chunk-aligned. This was the source of the corruption. We can still optimize for the most common case where a client is repeatedly writing small number of bytes followed by hflush/hsync with no pipeline recovery or append, by allowing the previous behavior for this specific case. If the incoming data has a duplicate portion and that is at the last chunk-boundary before the partial chunk on disk, datanode can use the checksum supplied by the client without redoing the checksum on its own. This reduces disk reads as well as CPU load for the checksum calculation. If the incoming packet data goes back further than the last on-disk chunk boundary, datanode will still do a recalculation, but this occurs rarely during pipeline recoveries. Thus the optimization for this specific case should be sufficient to speed up the vast majority of cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8759) Implement remote block reader in libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8759: - Attachment: HDFS-8759.001.patch Implement remote block reader in libhdfspp -- Key: HDFS-8759 URL: https://issues.apache.org/jira/browse/HDFS-8759 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch This jira tracks the effort of implementing the remote block reader that communicates with DN in libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8774) Implement FileSystem and InputStream API for libhdfspp
Haohui Mai created HDFS-8774: Summary: Implement FileSystem and InputStream API for libhdfspp Key: HDFS-8774 URL: https://issues.apache.org/jira/browse/HDFS-8774 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: HDFS-8707 This jira proposes to implement FileSystem and InputStream APIs for libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-8742: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks [~surendrasingh] for the contribution! Inotify: Support event for OP_TRUNCATE -- Key: HDFS-8742 URL: https://issues.apache.org/jira/browse/HDFS-8742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Fix For: 2.8.0 Attachments: HDFS-8742-001.patch, HDFS-8742.patch Currently inotify is not giving any event for Truncate operation. NN should send event for Truncate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626957#comment-14626957 ] Hudson commented on HDFS-8742: -- FAILURE: Integrated in Hadoop-trunk-Commit #8164 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8164/]) HDFS-8742. Inotify: Support event for OP_TRUNCATE. Contributed by Surendra Singh Lilhore. (aajisaka: rev 979c9ca2ca89e99dc7165abfa29c78d66de43d9a) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/inotify.proto * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/InotifyFSEditLogOpTranslator.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/inotify/Event.java Inotify: Support event for OP_TRUNCATE -- Key: HDFS-8742 URL: https://issues.apache.org/jira/browse/HDFS-8742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Fix For: 2.8.0 Attachments: HDFS-8742-001.patch, HDFS-8742.patch Currently inotify is not giving any event for Truncate operation. NN should send event for Truncate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627005#comment-14627005 ] Jing Zhao commented on HDFS-8433: - bq. This time, we don't need BlockIdRange class. We extends the fields of BlockTokenIdentifier. (See BlockTokenIdentifier#readFields(..) / write(..) ) Maybe more details? Not sure if I catch the idea here. How to avoid changing the readFields/writeFields but still adding a new field? blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil -- Key: HDFS-8433 URL: https://issues.apache.org/jira/browse/HDFS-8433 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Walter Su Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, HDFS-8433.01.patch The blockToken provided in LocatedStripedBlock is not used to create LocatedBlock in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil. We should also add ec tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8759) Implement remote block reader in libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627037#comment-14627037 ] Jing Zhao commented on HDFS-8759: - Agree with [~James Clampffer] that we need detailed documentation, especially for classes like Status. But it's fine to do this in a separate jira. +1 for the 001 patch. Implement remote block reader in libhdfspp -- Key: HDFS-8759 URL: https://issues.apache.org/jira/browse/HDFS-8759 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8759.000.patch, HDFS-8759.001.patch This jira tracks the effort of implementing the remote block reader that communicates with DN in libhdfspp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8058: Attachment: HDFS-8058-HDFS-7285.009.patch Good catch Jing! Uploading 09 patch to address the issue. {{testTruncateWithDataNodesRestartImmediately}} fails even without the patch. We should do some more debugging around it. To verify the new change I ran {{TestFSImage}}, {{TestINodeFile}} and {{TestStripedINodeFile}} locally and they all passed. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8742) Inotify: Support event for OP_TRUNCATE
[ https://issues.apache.org/jira/browse/HDFS-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626921#comment-14626921 ] Akira AJISAKA commented on HDFS-8742: - +1, the test failure looks unrelated to the patch. I confirmed the test passed locally. Inotify: Support event for OP_TRUNCATE -- Key: HDFS-8742 URL: https://issues.apache.org/jira/browse/HDFS-8742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8742-001.patch, HDFS-8742.patch Currently inotify is not giving any event for Truncate operation. NN should send event for Truncate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626969#comment-14626969 ] Arun Suresh commented on HDFS-7858: --- [~arpitagarwal], apologies for sitting on this... I was trying to refactor this as per [~jingzhao]'s suggestion (replacing RetryInvocationHandler with RequestHedgingInvocationHandler). Unfortunately, it was turning out to have a much more far-reaching impact (technically request hedging is different from retry, so the whole policy framework etc. would need to be refactored). If everyone is ok with the current approach, we can punt the larger refactoring to another JIRA and I can incorporate [~arpitagarwal]'s suggestion (skip the standby for subsequent requests) and provide a quick patch. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically), and if it's a standby NN, it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover, so they do not have to query ZK every time to find out the active NN. 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626950#comment-14626950 ] Arpit Agarwal commented on HDFS-7858: - Hi [~asuresh], were you thinking of posting an updated patch? The overall approach looks good. One comment from a quick look: RequestHedgingProxyProvider sends all requests to both NNs. Should it skip the standby for subsequent requests? Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically), and if it's a standby NN, it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover, so they do not have to query ZK every time to find out the active NN. 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
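A minimal sketch of the request-hedging idea under discussion: fire the same call at every candidate NameNode and return the first successful answer, so a standby that throws simply loses the race. This is not the actual RequestHedgingProxyProvider code; all names are hypothetical.
{code}
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: invoke every candidate concurrently, keep the first success.
class HedgingSketch {
  static <T> T invokeFirstSuccess(List<Callable<T>> candidates) throws Exception {
    if (candidates.isEmpty()) {
      throw new IllegalArgumentException("no candidates");
    }
    ExecutorService pool = Executors.newFixedThreadPool(candidates.size());
    try {
      ExecutorCompletionService<T> ecs = new ExecutorCompletionService<>(pool);
      for (Callable<T> c : candidates) {
        ecs.submit(c);
      }
      Exception last = null;
      for (int i = 0; i < candidates.size(); i++) {
        try {
          return ecs.take().get();       // first completed, non-failing call
        } catch (ExecutionException e) { // e.g. a StandbyException: try next
          last = e;
        }
      }
      throw last;                        // every candidate failed
    } finally {
      pool.shutdownNow();                // cancel the slower duplicates
    }
  }
}
{code}
Arpit's suggestion above would then amount to remembering which candidate lost the race and skipping it on subsequent calls instead of hedging every request.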
[jira] [Resolved] (HDFS-8758) Implement the continuation library for libhdfspp
[ https://issues.apache.org/jira/browse/HDFS-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-8758. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-8707 Target Version/s: HDFS-8707 Committed to HDFS-8707. Thanks Jing for reviews. Implement the continuation library for libhdfspp Key: HDFS-8758 URL: https://issues.apache.org/jira/browse/HDFS-8758 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Haohui Mai Fix For: HDFS-8707 Attachments: HDFS-8758.000.patch libhdfspp uses continuations as basic building blocks to implement asynchronous operations. This jira imports the continuation library into the repository. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626963#comment-14626963 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12702886/HDFS-7858.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 979c9ca | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11704/console | This message was automatically generated. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically), and if it's a standby NN, it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover, so they do not have to query ZK every time to find out the active NN. 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8058) Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627104#comment-14627104 ] Jing Zhao commented on HDFS-8058: - Thanks for the update, Zhe. +1 for the 09 patch. {{testTruncateWithDataNodesRestartImmediately}} has been fixed in trunk recently so we can ignore it in the feature branch now. Erasure coding: use BlockInfo[] for both striped and contiguous blocks in INodeFile --- Key: HDFS-8058 URL: https://issues.apache.org/jira/browse/HDFS-8058 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Yi Liu Assignee: Zhe Zhang Attachments: HDFS-8058-HDFS-7285.003.patch, HDFS-8058-HDFS-7285.004.patch, HDFS-8058-HDFS-7285.005.patch, HDFS-8058-HDFS-7285.006.patch, HDFS-8058-HDFS-7285.007.patch, HDFS-8058-HDFS-7285.008.patch, HDFS-8058-HDFS-7285.009.patch, HDFS-8058.001.patch, HDFS-8058.002.patch This JIRA is to use {{BlockInfo[] blocks}} for both striped and contiguous blocks in INodeFile. Currently {{FileWithStripedBlocksFeature}} keeps separate list for striped blocks, and the methods there duplicate with those in INodeFile, and current code need to judge {{isStriped}} then do different things. Also if file is striped, the {{blocks}} in INodeFile occupy a reference memory space. These are not necessary, and we can use the same {{blocks}} to make code more clear. I keep {{FileWithStripedBlocksFeature}} as empty for follow use: I will file a new JIRA to move {{dataBlockNum}} and {{parityBlockNum}} from *BlockInfoStriped* to INodeFile, since ideally they are the same for all striped blocks in a file, and store them in block will waste NN memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627215#comment-14627215 ] Walter Su commented on HDFS-8433: - I was thinking {code}
public void readFields(DataInput in) throws IOException {
  ...
  for (int i = 0; i < length; i++) {
    modes.add(WritableUtils.readEnum(in, AccessMode.class));
  }
+ idRange = WritableUtils.readVLong(in);
}

@Override
public void write(DataOutput out) throws IOException {
  ...
  for (AccessMode aMode : modes) {
    WritableUtils.writeEnum(out, aMode);
  }
+ WritableUtils.writeVLong(out, idRange);
}
{code} A token generated by a new NN can be parsed by an old DN. A token generated by an old DN can be parsed by an old DN. But a token generated by an old DN can't be parsed by a new DN. If DNs didn't generate tokens there would be no problem, but {{DataNode.DataTransfer}} actually does generate tokens. Now I think this idea is a bad one. I miss protobuf now, but there's nothing we can do. blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil -- Key: HDFS-8433 URL: https://issues.apache.org/jira/browse/HDFS-8433 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Walter Su Attachments: HDFS-8433-HDFS-7285.02.patch, HDFS-8433.00.patch, HDFS-8433.01.patch The blockToken provided in LocatedStripedBlock is not used to create LocatedBlock in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil. We should also add ec tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8619) Erasure Coding: revisit replica counting for striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8619: Attachment: HDFS-8619-HDFS-7285.001.patch Update the patch. Instead of tracking BlockInfo in CorruptReplicasMap, the 001 patch uses a Block with the same ID as the striped block group. Erasure Coding: revisit replica counting for striped blocks --- Key: HDFS-8619 URL: https://issues.apache.org/jira/browse/HDFS-8619 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-8619-HDFS-7285.001.patch, HDFS-8619.000.patch Currently we use the same {{BlockManager#countNodes}} method for striped blocks, which simply treats each internal block as a replica. However, for a striped block we may have more complicated scenarios, e.g., multiple replicas of the first internal block while some other internal blocks are missing. Using the current {{countNodes}} method can lead to wrong decisions in these scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
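To see why plain replica counting misleads for striped blocks, a small sketch under an assumed 6+3 (data + parity) schema: what matters is how many distinct internal blocks are live, not how many storages report one.
{code}
import java.util.HashSet;
import java.util.Set;

// Sketch: distinct live internal blocks vs. the raw storage count.
public class StripedCountSketch {
  // reportedIndices: internal-block index reported by each live storage
  static int distinctLiveInternalBlocks(int[] reportedIndices) {
    Set<Integer> distinct = new HashSet<>();
    for (int idx : reportedIndices) {
      distinct.add(idx);
    }
    return distinct.size();
  }

  public static void main(String[] args) {
    // 9 storages report, but three hold copies of internal block 0,
    // and internal blocks 7 and 8 are missing entirely.
    int[] reports = {0, 0, 0, 1, 2, 3, 4, 5, 6};
    System.out.println(reports.length);                      // naive count: 9
    System.out.println(distinctLiveInternalBlocks(reports)); // actual: 7 of 9
  }
}
{code}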
[jira] [Updated] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8769: Summary: Erasure Coding: unit test for SequentialBlockGroupIdGenerator (was: unit test for SequentialBlockGroupIdGenerator) Erasure Coding: unit test for SequentialBlockGroupIdGenerator - Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Rakesh R -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8778) TestBlockReportRateLimiting#testLeaseExpiration can deadlock
[ https://issues.apache.org/jira/browse/HDFS-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627453#comment-14627453 ] Hadoop QA commented on HDFS-8778: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 7m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 24s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 27s | Tests failed in hadoop-hdfs. | | | | 184m 0s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12745354/HDFS-8778.01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 0a16ee6 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11707/console | This message was automatically generated. TestBlockReportRateLimiting#testLeaseExpiration can deadlock Key: HDFS-8778 URL: https://issues.apache.org/jira/browse/HDFS-8778 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8778.01.patch {{requestBlockReportLease}} blocks on DataNode registration while holding the NameSystem read lock. DataNode registration can block on the NameSystem read lock if a writer gets in the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)