[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553735#comment-14553735 ] Walter Su commented on HDFS-7621: - 006 patch changes: 1. rebase; 2. update {{blockManager}} logic Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, HDFS-7621.006.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
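The rule in the HDFS-7621 description (a migration must not decrease the number of racks, applied to all blocks of one block group) can be sketched as below. This is only an illustration under stated assumptions: the class BlockGroupMoveCheck and the method isGoodMove are hypothetical names, racks are modelled as plain strings, and nothing here is taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not code from any HDFS-7621 patch.
class BlockGroupMoveCheck {
  /**
   * racks      rack of each internal block of one block group
   * movedIndex index of the internal block that would be moved
   * targetRack rack of the proposed target datanode
   * Returns true if the group spans at least as many racks after the move.
   */
  static boolean isGoodMove(List<String> racks, int movedIndex, String targetRack) {
    Set<String> before = new HashSet<>(racks);
    List<String> moved = new ArrayList<>(racks);
    moved.set(movedIndex, targetRack);
    Set<String> after = new HashSet<>(moved);
    return after.size() >= before.size();
  }

  public static void main(String[] args) {
    List<String> racks = Arrays.asList("/r1", "/r2", "/r3");
    // Moving the third block to a new rack keeps three racks.
    System.out.println(isGoodMove(racks, 2, "/r4"));
    // Moving it onto /r1 would leave only two racks.
    System.out.println(isGoodMove(racks, 2, "/r1"));
  }
}
```

The point of the sketch is that the rack set is computed over the whole block group rather than over the replicas of a single block.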
[jira] [Updated] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7621: Attachment: HDFS-7621.006.patch Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, HDFS-7621.006.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8433) blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555454#comment-14555454 ] Walter Su commented on HDFS-8433: - Hi, [~szetszwo]! I'm willing to work on this if you are too busy, unless you have already made progress. blockToken is not set in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil -- Key: HDFS-8433 URL: https://issues.apache.org/jira/browse/HDFS-8433 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze The blockToken provided in LocatedStripedBlock is not used to create LocatedBlock in constructInternalBlock and parseStripedBlockGroup in StripedBlockUtil. We should also add EC tests with security on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Resolution: Won't Fix Status: Resolved (was: Patch Available) Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555476#comment-14555476 ] Walter Su commented on HDFS-8441: - Thanks Kai. I committed this. Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Fix For: HDFS-7285 Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch, HDFS-8441-HDFS-7285.003.patch, HDFS-8441-HDFS-7285.004.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Resolution: Fixed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Fix For: HDFS-7285 Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch, HDFS-8441-HDFS-7285.003.patch, HDFS-8441-HDFS-7285.004.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
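Change 2 in the HDFS-8441 description (check the condition before the storage count is updated) comes down to an ordering rule: validate first, mutate second. A minimal sketch follows, assuming a hypothetical class SetReplicationSketch with a single counter standing in for the real accounting. The JDK's UnsupportedOperationException is used here only as a stand-in for {{UnsupportedActionException}}, so the example stays self-contained.

```java
// Hypothetical stand-in for NameNode-side accounting; not HDFS code.
class SetReplicationSketch {
  private long storageCount = 0;

  long getStorageCount() {
    return storageCount;
  }

  void setReplication(boolean inEcZone, short oldRepl, short newRepl, long fileSize) {
    // Validate first, so that a rejected request leaves the count untouched.
    if (inEcZone) {
      throw new UnsupportedOperationException(
          "Cannot set replication on a file in an EC zone");
    }
    // Only now apply the change in consumed storage.
    storageCount += (long) (newRepl - oldRepl) * fileSize;
  }

  public static void main(String[] args) {
    SetReplicationSketch ns = new SetReplicationSketch();
    ns.setReplication(false, (short) 1, (short) 3, 100L);
    System.out.println("count after accepted call: " + ns.getStorageCount());
    try {
      ns.setReplication(true, (short) 3, (short) 5, 100L);
    } catch (UnsupportedOperationException e) {
      System.out.println("rejected, count unchanged: " + ns.getStorageCount());
    }
  }
}
```

If the check ran after the update instead, a rejected call would leave a stale delta behind in the counter.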
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: HDFS-8186-HDFS-7285.002.txt, HDFS-8186.001.txt. These are the historical patches. Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Fix For: HDFS-7285 Attachments: HDFS-8186-HDFS-7285.002.txt, HDFS-8186-HDFS-7285.003.patch, HDFS-8186.001.txt This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Priority: Trivial (was: Major) Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Trivial Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1472#comment-1472 ] Walter Su commented on HDFS-8342: - {{TestWriteReadStripedFile.verifySeek()}} has a similar test. I just thought {{TestDFSStripedInputStream}} is a better place for a unit test. It's trivial. The previous one works too. Let's move forward to a thorough system test. Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize
[ https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555623#comment-14555623 ] Walter Su commented on HDFS-8220: -
{code}
public DatanodeStorageInfo[] chooseTarget4NewBlock(final String src, ...
  final DatanodeStorageInfo[] targets = blockplacement.chooseTarget(src,
      numOfReplicas, client, excludedNodes, blocksize,
      favoredDatanodeDescriptors, storagePolicy);
  if (targets.length < minReplication) {
    throw new IOException("File " + src + " could only be replicated to " ...
{code}
We could add the same logic here for striping. Since the client can't handle this situation (too few nodes/targets), we let the NN throw an IOException to notify the client.
Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize --- Key: HDFS-8220 URL: https://issues.apache.org/jira/browse/HDFS-8220 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.008.patch
During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to validate the available datanodes against the {{BlockGroupSize}}. Please see the exception to understand more:
{code}
2015-04-22 14:56:11,313 WARN hdfs.DFSClient (DataStreamer.java:run(538)) - DataStreamer Exception
java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
2015-04-22 14:56:11,313 INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
2015-04-22 14:56:11,313 ERROR hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
java.io.IOException: DataStreamer Exception:
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
Caused by: java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
    ... 1 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
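The suggestion in the HDFS-8220 comment (fail on the NameNode side with an IOException when too few targets are chosen, instead of letting the client hit a NullPointerException) could look roughly like the sketch below. The class StripedTargetCheck, the method checkEnoughTargets and the parameter minRequired are hypothetical. The thread does not say which minimum should apply to a striped block group, so the value is simply passed in.

```java
import java.io.IOException;

// Hypothetical sketch; names and message text are illustrative, not NameNode code.
class StripedTargetCheck {
  /**
   * Throws an IOException if fewer targets were chosen than required,
   * so the caller gets a clear error instead of failing later.
   */
  static void checkEnoughTargets(String src, int numTargets, int minRequired)
      throws IOException {
    if (numTargets < minRequired) {
      throw new IOException("File " + src + " could only be allocated "
          + numTargets + " targets, fewer than the required " + minRequired);
    }
  }

  public static void main(String[] args) {
    try {
      checkEnoughTargets("/ec/file", 9, 6);   // enough targets, no exception
      checkEnoughTargets("/ec/file", 3, 6);   // too few targets, throws
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The sketch mirrors the shape of the replicated-block check quoted in the comment. Where the check would sit for striped files is left open here.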
[jira] [Updated] (HDFS-8444) Erasure Coding: fix cannot rename a zone dir
[ https://issues.apache.org/jira/browse/HDFS-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8444: Status: Patch Available (was: Open) Erasure Coding: fix cannot rename a zone dir Key: HDFS-8444 URL: https://issues.apache.org/jira/browse/HDFS-8444 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8444-HDFS-7285.001.patch We create an EC zone {{/my_ec_zone}} and want to rename it to {{/myZone}}, but the rename fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8444) Erasure Coding: fix cannot rename a zone dir
[ https://issues.apache.org/jira/browse/HDFS-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8444: Attachment: HDFS-8444-HDFS-7285.001.patch Erasure Coding: fix cannot rename a zone dir Key: HDFS-8444 URL: https://issues.apache.org/jira/browse/HDFS-8444 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8444-HDFS-7285.001.patch We create an EC zone {{/my_ec_zone}} and want to rename it to {{/myZone}}, but the rename fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8444) Erasure Coding: fix cannot rename a zone dir
Walter Su created HDFS-8444: --- Summary: Erasure Coding: fix cannot rename a zone dir Key: HDFS-8444 URL: https://issues.apache.org/jira/browse/HDFS-8444 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su We create an EC zone {{/my_ec_zone}} and want to rename it to {{/myZone}}, but the rename fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Description: Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count 3. support creating EC file with replication=0 was: Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Description: Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 was: Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count 3. support creating EC file with replication=0 Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Attachment: HDFS-8441-HDFS-7285.002.patch Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552069#comment-14552069 ] Walter Su commented on HDFS-8441: - The 002 patch adds support for creating an EC file with replication=0. Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8444) Erasure Coding: fix cannot rename a zone dir
[ https://issues.apache.org/jira/browse/HDFS-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553474#comment-14553474 ] Walter Su commented on HDFS-8444: - The Jenkins failures are not related. Please review. Erasure Coding: fix cannot rename a zone dir Key: HDFS-8444 URL: https://issues.apache.org/jira/browse/HDFS-8444 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8444-HDFS-7285.001.patch We create an EC zone {{/my_ec_zone}} and want to rename it to {{/myZone}}, but the rename fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553511#comment-14553511 ] Walter Su commented on HDFS-8441: - ... Maybe we could add a method in {{FSDirectory#isInECZone(src)}} and use it to tell the condition? Good idea. I found that {{dir#isInECZone(iip)}} exists, so I added {{ns#isInEcZone(src)}}. I think {{ns}} usually takes a {{src}} argument and {{dir}} takes an {{iip}}. ... Please add message to the fail() statement like... Done. Thank you both. Uploaded the 003 patch. Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch, HDFS-8441-HDFS-7285.003.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Attachment: HDFS-8441-HDFS-7285.003.patch Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch, HDFS-8441-HDFS-7285.002.patch, HDFS-8441-HDFS-7285.003.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating storage count 3. support creating EC file with replication=0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8333) Create EC zone should not need superuser privilege
[ https://issues.apache.org/jira/browse/HDFS-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550012#comment-14550012 ] Walter Su commented on HDFS-8333: - Patch looks good. I'm +1 for this idea. Hi, [~drankye], and [~zhz]. What do you think about it? Create EC zone should not need superuser privilege -- Key: HDFS-8333 URL: https://issues.apache.org/jira/browse/HDFS-8333 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8333-HDFS-7285.000.patch Creating an EC zone should not need superuser privilege. For example, in a multi-tenant scenario, common users only manage their own directories and subdirectories. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8378) Erasure Coding: Few improvements for the erasure coding worker
[ https://issues.apache.org/jira/browse/HDFS-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549938#comment-14549938 ] Walter Su commented on HDFS-8378: - Thanks [~rakeshr] for the contribution! It's committed in the branch.
Erasure Coding: Few improvements for the erasure coding worker -- Key: HDFS-8378 URL: https://issues.apache.org/jira/browse/HDFS-8378 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Priority: Minor Fix For: HDFS-7285 Attachments: HDFS-8378-HDFS-7285.00.patch
# The following log is confusing; make it tidy. It's missing a {{break;}} statement, which causes these unwanted logs.
{code}
2015-05-10 15:06:45,878 INFO datanode.DataNode (BPOfferService.java:processCommandFromActive(728)) - DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY
2015-05-10 15:06:45,879 WARN datanode.DataNode (BPOfferService.java:processCommandFromActive(732)) - Unknown DatanodeCommand action: 11
{code}
# Add the exception trace to the log; this would improve debuggability.
{code}
} catch (Throwable e) {
  LOG.warn("Failed to recover striped block: " + blockGroup);
}
{code}
# Make member variables present in ErasureCodingWorker, ReconstructAndTransferBlock, StripedReader {{private}} {{final}}
# Correct spelling of the variable {{STRIPED_READ_TRHEAD_POOL}} to {{STRIPED_READ_THREAD_POOL}}
# Good to add debug logs to print the striped read pool size
{code}
LOG.debug("Using striped reads; pool threads=" + num);
{code}
# Add a meaningful message to the precondition check:
{code}
Preconditions.checkArgument(liveIndices.length == sources.length);
{code}
# Remove unused import
{code}
import org.apache.hadoop.hdfs.server.common.HdfsServerConstants;
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
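The first item in the HDFS-8378 description is a plain Java switch fall-through: without {{break;}}, control runs from the matching case into the default branch, which is why a WARN line follows the INFO line. The sketch below reproduces the effect in isolation. The class CommandSwitchDemo and its methods are hypothetical, and the constant value 11 is only inferred from the log line quoted in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the fall-through; not the BPOfferService code.
class CommandSwitchDemo {
  // Value inferred from "Unknown DatanodeCommand action: 11" in the log.
  static final int DNA_ERASURE_CODING_RECOVERY = 11;

  // Buggy variant: the case has no break, so default also runs.
  static List<String> processBuggy(int action) {
    List<String> log = new ArrayList<>();
    switch (action) {
      case DNA_ERASURE_CODING_RECOVERY:
        log.add("INFO DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY");
        // missing break: execution falls through into default
      default:
        log.add("WARN Unknown DatanodeCommand action: " + action);
    }
    return log;
  }

  // Fixed variant: break stops execution after the matching case.
  static List<String> processFixed(int action) {
    List<String> log = new ArrayList<>();
    switch (action) {
      case DNA_ERASURE_CODING_RECOVERY:
        log.add("INFO DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY");
        break;
      default:
        log.add("WARN Unknown DatanodeCommand action: " + action);
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(processBuggy(DNA_ERASURE_CODING_RECOVERY));
    System.out.println(processFixed(DNA_ERASURE_CODING_RECOVERY));
  }
}
```

The buggy variant returns two log lines for the known command and the fixed variant returns one.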
[jira] [Updated] (HDFS-8378) Erasure Coding: Few improvements for the erasure coding worker
[ https://issues.apache.org/jira/browse/HDFS-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8378: Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)
Erasure Coding: Few improvements for the erasure coding worker -- Key: HDFS-8378 URL: https://issues.apache.org/jira/browse/HDFS-8378 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Priority: Minor Fix For: HDFS-7285 Attachments: HDFS-8378-HDFS-7285.00.patch
# The following log is confusing; make it tidy. It's missing a {{break;}} statement, which causes these unwanted logs.
{code}
2015-05-10 15:06:45,878 INFO datanode.DataNode (BPOfferService.java:processCommandFromActive(728)) - DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY
2015-05-10 15:06:45,879 WARN datanode.DataNode (BPOfferService.java:processCommandFromActive(732)) - Unknown DatanodeCommand action: 11
{code}
# Add the exception trace to the log; this would improve debuggability.
{code}
} catch (Throwable e) {
  LOG.warn("Failed to recover striped block: " + blockGroup);
}
{code}
# Make member variables present in ErasureCodingWorker, ReconstructAndTransferBlock, StripedReader {{private}} {{final}}
# Correct spelling of the variable {{STRIPED_READ_TRHEAD_POOL}} to {{STRIPED_READ_THREAD_POOL}}
# Good to add debug logs to print the striped read pool size
{code}
LOG.debug("Using striped reads; pool threads=" + num);
{code}
# Add a meaningful message to the precondition check:
{code}
Preconditions.checkArgument(liveIndices.length == sources.length);
{code}
# Remove unused import
{code}
import org.apache.hadoop.hdfs.server.common.HdfsServerConstants;
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Attachment: HDFS-8441-HDFS-7285.001.patch Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Attachment: (was: HDFS-8441-HDFS-7285.001.patch) Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551617#comment-14551617 ] Walter Su commented on HDFS-8186: - Thanks Zhe Zhang. The Jenkins result didn't come out. I am trying to re-trigger it. Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.003.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
Walter Su created HDFS-8441: --- Summary: Erasure Coding: make condition check earlier for setReplication Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Attachment: HDFS-8441-HDFS-7285.001.patch Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8441) Erasure Coding: make condition check earlier for setReplication
[ https://issues.apache.org/jira/browse/HDFS-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8441: Status: Patch Available (was: Open) Erasure Coding: make condition check earlier for setReplication --- Key: HDFS-8441 URL: https://issues.apache.org/jira/browse/HDFS-8441 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8441-HDFS-7285.001.patch Changes: 1. {{UnsupportedActionException}} is more user-friendly. 2. check condition before updating quota count -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8438) Erasure Coding: support concat files in same EC zone
Walter Su created HDFS-8438: --- Summary: Erasure Coding: support concat files in same EC zone Key: HDFS-8438 URL: https://issues.apache.org/jira/browse/HDFS-8438 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: (was: HDFS-8186-HDFS-7285.003.patch) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Status: Open (was: Patch Available) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.003.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: HDFS-8186-HDFS-7285.003.patch Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.003.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Status: Patch Available (was: Open) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.003.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: (was: HDFS-8186.001.patch) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. This does not include: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
[ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550433#comment-14550433 ] Walter Su commented on HDFS-8131: - Hi, [~kihwal]! Could you take a look at HDFS-8390? It's related. Do you think it's necessary or not? Implement a space balanced block placement policy - Key: HDFS-8131 URL: https://issues.apache.org/jira/browse/HDFS-8131 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Labels: BlockPlacementPolicy Fix For: 2.8.0 Attachments: HDFS-8131-v1.diff, HDFS-8131-v2.diff, HDFS-8131-v3.diff, HDFS-8131.004.patch, HDFS-8131.005.patch, HDFS-8131.006.patch, balanced.png The default block placement policy chooses datanodes for new blocks randomly, which results in unbalanced space utilization among datanodes after a cluster expansion: the old datanodes stay at a high used-space percentage and the newly added ones at a low percentage. Though we can use the external balancer tool to even out space usage, it costs extra network IO and its speed is not easy to control. An easy solution is to implement a balanced block placement policy that chooses datanodes with a low used-space percentage for new blocks with a slightly higher probability. Before long, the used-space percentage of the datanodes will tend to balance out. Suggestions and discussions are welcome. Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
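The idea in the description above (prefer emptier datanodes "with a slightly higher probability") can be illustrated with a small sketch. This is not the patch attached to HDFS-8131; the class names {{NodeUsage}} and {{SpaceBalancedChooser}}, the two-candidate sampling, and the {{preference}} knob are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a datanode; only the used-space percentage matters here. */
final class NodeUsage {
    final String name;
    final double usedPercent;

    NodeUsage(String name, double usedPercent) {
        this.name = name;
        this.usedPercent = usedPercent;
    }
}

/** Sketch: sample two random candidates and favor the emptier one. */
final class SpaceBalancedChooser {
    private final Random random;
    // Assumed knob: probability of returning the emptier of the two candidates.
    // 0.5 degenerates to plain random choice; values above 0.5 favor empty nodes.
    private final double preference;

    SpaceBalancedChooser(Random random, double preference) {
        this.random = random;
        this.preference = preference;
    }

    NodeUsage choose(List<NodeUsage> nodes) {
        // Draw two candidates independently (they may be the same node).
        NodeUsage a = nodes.get(random.nextInt(nodes.size()));
        NodeUsage b = nodes.get(random.nextInt(nodes.size()));
        NodeUsage emptier = a.usedPercent <= b.usedPercent ? a : b;
        NodeUsage fuller = (emptier == a) ? b : a;
        // nextDouble() is in [0, 1), so preference 1.0 always picks the emptier node.
        return random.nextDouble() < preference ? emptier : fuller;
    }
}
```

With a preference only slightly above 0.5, new blocks drift toward the newly added datanodes without starving the old ones, which matches the "balanced over time, no extra network IO" goal stated in the description.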
[jira] [Created] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
Walter Su created HDFS-8430: --- Summary: Erasure coding: update DFSClient.getFileChecksum() logic for stripe files Key: HDFS-8430 URL: https://issues.apache.org/jira/browse/HDFS-8430 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su HADOOP-3981 introduces a distributed file checksum algorithm. It is designed for replicated blocks. {{DFSClient.getFileChecksum()}} needs some updates so that it also works for striped block groups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Labels: BlockPlacementPolicy (was: ) merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Labels: BlockPlacementPolicy Attachments: HDFS-8390.001.patch, HDFS-8390.002.patch, HDFS-8390.003.patch, HDFS-8390.004.patch We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, HDFS-8131, and maybe HDFS-4894 and HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We then need to implement another N policies: {noformat} BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
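To make the proposal above concrete, here is a minimal sketch of what "node-group awareness merged into the default policy" could look like. It is not taken from any HDFS-8390 patch: {{SketchNode}}, {{PolicyDefault}}, {{PolicyCustomized1}}, the {{nodeGroupAware}} switch and the simplistic conflict rule are all hypothetical names and rules chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical node description: host name plus the node group it belongs to. */
final class SketchNode {
    final String host;
    final String nodeGroup;

    SketchNode(String host, String nodeGroup) {
        this.host = host;
        this.nodeGroup = nodeGroup;
    }
}

/** Sketch of a default policy with node-group awareness folded in as a switch. */
class PolicyDefault {
    protected final boolean nodeGroupAware;

    PolicyDefault(boolean nodeGroupAware) {
        this.nodeGroupAware = nodeGroupAware;
    }

    /** Two nodes conflict if they are the same host or, when enabled, share a node group. */
    protected boolean conflicts(SketchNode chosen, SketchNode candidate) {
        if (chosen.host.equals(candidate.host)) {
            return true;
        }
        return nodeGroupAware && chosen.nodeGroup.equals(candidate.nodeGroup);
    }

    /** Greedily pick up to 'wanted' candidates that do not conflict with earlier picks. */
    List<SketchNode> chooseTargets(List<SketchNode> candidates, int wanted) {
        List<SketchNode> result = new ArrayList<>();
        for (SketchNode candidate : candidates) {
            if (result.size() == wanted) {
                break;
            }
            boolean ok = true;
            for (SketchNode chosen : result) {
                if (conflicts(chosen, candidate)) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                result.add(candidate);
            }
        }
        return result;
    }
}

/** A customized policy extends only the default one and inherits node-group support. */
class PolicyCustomized1 extends PolicyDefault {
    PolicyCustomized1(boolean nodeGroupAware) {
        super(nodeGroupAware);
    }

    @Override
    List<SketchNode> chooseTargets(List<SketchNode> candidates, int wanted) {
        // Hypothetical customization: consider candidates in reverse order.
        List<SketchNode> reversed = new ArrayList<>(candidates);
        Collections.reverse(reversed);
        return super.chooseTargets(reversed, wanted);
    }
}
```

With this shape, N customized policies need N classes rather than 2N, because the node-group check lives in one place and is toggled by configuration instead of by a parallel class hierarchy.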
[jira] [Updated] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8417: Status: Patch Available (was: Open) Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8417-HDFS-7285.001.patch, HDFS-8417-HDFS-7285.002.patch When the file length is cellSize * dataBlocks + 123 and a pread starts from cellSize * dataBlocks + 1, it fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
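The failing case described above is easier to see with the offset arithmetic written out. The sketch below only maps a logical file offset to a stripe, a cell within that stripe, and an offset within the cell; it is not the DFSStripedInputStream code. The class {{StripeMath}} is hypothetical, and the concrete values cellSize = 64 KiB and dataBlocks = 6 are assumptions, since the issue gives only the symbolic names.

```java
/** Sketch: map a logical file offset onto the striped layout (hypothetical helper). */
final class StripeMath {
    /** Index of the stripe that contains the given logical offset. */
    static long stripeIndex(long offset, int cellSize, int dataBlocks) {
        long stripeSize = (long) cellSize * dataBlocks;
        return offset / stripeSize;
    }

    /** Index of the data cell, within its stripe, that contains the offset. */
    static int cellIndexInStripe(long offset, int cellSize, int dataBlocks) {
        long stripeSize = (long) cellSize * dataBlocks;
        return (int) ((offset % stripeSize) / cellSize);
    }

    /** Offset of the byte inside its cell. */
    static int offsetInCell(long offset, int cellSize) {
        return (int) (offset % cellSize);
    }

    public static void main(String[] args) {
        int cellSize = 64 * 1024; // assumed value
        int dataBlocks = 6;       // assumed value
        long fileLength = (long) cellSize * dataBlocks + 123;
        long start = (long) cellSize * dataBlocks + 1;
        System.out.println("stripe=" + stripeIndex(start, cellSize, dataBlocks)
            + " cell=" + cellIndexInStripe(start, cellSize, dataBlocks)
            + " offsetInCell=" + offsetInCell(start, cellSize)
            + " bytesLeft=" + (fileLength - start));
    }
}
```

Under these assumptions the read starts one byte into the first cell of the second stripe, with 122 bytes left in the file, so the case exercises a pread that begins in a stripe other than the first, as the revised issue title says.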
[jira] [Commented] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547998#comment-14547998 ] Walter Su commented on HDFS-8417: - {{TestDFSStripedInputStream}} doesn't depend on {{DFSStripedOutputStream}}, which makes it different from {{TestWriteReadStripedFile}}. So it's necessary to include tests. The 002 patch includes tests. Please review. Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8417-HDFS-7285.001.patch, HDFS-8417-HDFS-7285.002.patch When the file length is cellSize * dataBlocks + 123 and a pread starts from cellSize * dataBlocks + 1, it fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8417: Attachment: HDFS-8417-HDFS-7285.002.patch Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8417-HDFS-7285.001.patch, HDFS-8417-HDFS-7285.002.patch When file length is cellSize * dataBlocks +123, and Pread start from cellSize * dataBlocks + 1, it will fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8417: Attachment: HDFS-8417-HDFS-7285.001.patch Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8417-HDFS-7285.001.patch When file length is cellSize * dataBlocks +123, and Pread start from cellSize * dataBlocks + 1, it will fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: HDFS-8352-HDFS-7285.001.patch Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547688#comment-14547688 ] Walter Su commented on HDFS-8417: - Tests are included in HDFS-8352. The HDFS-8352 tests fail for now; they will pass once this patch is applied. Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8417-HDFS-7285.001.patch When the file length is cellSize * dataBlocks + 123 and a pread starts from cellSize * dataBlocks + 1, it fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8417) Erasure Coding: Pread failed to read data starting from not-first stripe
[ https://issues.apache.org/jira/browse/HDFS-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8417: Summary: Erasure Coding: Pread failed to read data starting from not-first stripe (was: Erasure Coding: Pread failed to read data starting from last incomplete stripe) Erasure Coding: Pread failed to read data starting from not-first stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su When file length is cellSize * dataBlocks +123, and Pread start from cellSize * dataBlocks + 1, it will fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: (was: HDFS-8352-HDFS-7285.001.patch) Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549644#comment-14549644 ] Walter Su commented on HDFS-8186: - ...Looks like we should add a TODO that webhdfs on striped files will be supported later? It's done, not a TODO. WebHDFS already supports writing striped files (HDFS-8352). {{chooseTarget4WebHDFS}} returns a proxy DN. We only require the proxy DN to be close to the client; it may or may not be one of the target DNs. The proxy DN creates a DFSClient to redirect data from the client to the target DNs. (See HDFS-2316 for details.) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.002.patch, HDFS-8186.001.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. Not included: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: HDFS-8186-HDFS-7285.003.patch Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.002.patch, HDFS-8186-HDFS-7285.003.patch, HDFS-8186.001.patch This includes: 1. User can config block placement policy for EC file in xml configuration file. 2. EC policy works for EC file, replication policy works for non-EC file. They are coexistent. Not includes: 1. Details of block placement policy for EC. Discussion and implementation goes to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549666#comment-14549666 ] Walter Su commented on HDFS-8352: - I'm very sorry. I'll keep it in mind. Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8339) Erasure Coding: Badly treated when createBlockOutputStream failed in DataStreamer
[ https://issues.apache.org/jira/browse/HDFS-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su resolved HDFS-8339. - Resolution: Duplicate Erasure Coding: Badly treated when createBlockOutputStream failed in DataStreamer -- Key: HDFS-8339 URL: https://issues.apache.org/jira/browse/HDFS-8339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su

h4. Issue 1: leading streamer calls abandonBlock(..) and get new blockGroup from NN, primary streamer should sync with non-leading streamer instead of throw exception( offer twice to stripedBlock blockingQueue).
{noformat}
2015-05-07 18:58:05,335 INFO hdfs.DataStreamer (DataStreamer.java:nextBlockOutputStream(1386)) - Abandoning BP-172584615-9.96.1.34-1430996280714:blk_-9223372036854775792_1001
...
2015-05-07 18:58:05,373 WARN hdfs.DataStreamer (DataStreamer.java:run(572)) - DataStreamer Exception
java.io.IOException: Failed: LocatedBlock{BP-172584615-9.96.1.34-1430996280714:blk_-9223372036854775770_1002; getBlockSize()=0; corrupt=false; offset=1572864; locs=[DatanodeInfoWithStorage[127.0.0.1:52490,DS-6080b76f-adf7-45a8-aa0e-e0e82c2c1569,DISK]]}, i=6
	at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:117)
	at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:120)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1364)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:461)
	at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48)
{noformat}
h4. Issue 2: non-leading streamer calls abandonBlock(..) and get new locatedBlock from coordinator. Actually it's the last blockGroup, no more locatedBlock can poll from stripedBlocks blockingQueue. Other 8 streamer finished and closed, but this streamer hangs about 90 seconds.
{noformat}
2015-05-07 19:21:25,357 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2768)) - BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:51998 is added to ...
2015-05-07 19:22:55,250 WARN hdfs.DataStreamer (DataStreamer.java:run(572)) - DataStreamer Exception
java.io.IOException: Failed: i=1
	at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.getStripedBlock(DFSStripedOutputStream.java:130)
	at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:124)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1364)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:461)
	at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48)
{noformat}
h4. Issue 3: remove abandonBlock(..) RPC call for non-leading streamer
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Resolution: Fixed Target Version/s: HDFS-7285 Status: Resolved (was: Patch Available) The Jenkins issues are not related. I just committed this. Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8199) Erasure Coding: System test of creating ECZone and EC files.
[ https://issues.apache.org/jira/browse/HDFS-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547560#comment-14547560 ] Walter Su commented on HDFS-8199: - Sorry. Mistake. normal files: 13m11s ec files: 45m33s Erasure Coding: System test of creating ECZone and EC files. Key: HDFS-8199 URL: https://issues.apache.org/jira/browse/HDFS-8199 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Kai Sasaki Assignee: Yong Zhang Attachments: HDFS-8199.000.patch System test of creating ECZone and EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
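For readers who want the corrected timings above as throughput figures, the following sketch does the conversion for the 100G (1G * 100) workload reported on this issue. The class {{WriteThroughput}} is hypothetical, and treating "G" as GiB (1024 MiB) is an assumption; the thread does not say which unit was meant.

```java
/** Sketch: convert the reported wall-clock times into average write throughput. */
final class WriteThroughput {
    /** Convert a "XmYs" duration into seconds. */
    static long toSeconds(int minutes, int seconds) {
        return minutes * 60L + seconds;
    }

    /** Average throughput in MiB per second. */
    static double mibPerSecond(long totalMiB, long seconds) {
        return (double) totalMiB / seconds;
    }

    public static void main(String[] args) {
        long totalMiB = 100L * 1024;             // 100G, assuming G means GiB
        long normalSeconds = toSeconds(13, 11);  // normal files: 13m11s
        long ecSeconds = toSeconds(45, 33);      // ec files: 45m33s
        System.out.printf("normal: %.1f MiB/s%n", mibPerSecond(totalMiB, normalSeconds));
        System.out.printf("ec:     %.1f MiB/s%n", mibPerSecond(totalMiB, ecSeconds));
        System.out.printf("ec is %.2fx slower%n", (double) ecSeconds / normalSeconds);
    }
}
```

These are averages over the whole run on the 1 NN + 9 DN + 1 client setup mentioned in the earlier comment; they say nothing about where the time was spent.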
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: (was: HDFS-8352-HDFS-7285.001.patch) Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: HDFS-8352-HDFS-7285.001.patch Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8417) Erasure Coding: Pread failed to read data starting from last incomplete stripe
Walter Su created HDFS-8417: --- Summary: Erasure Coding: Pread failed to read data starting from last incomplete stripe Key: HDFS-8417 URL: https://issues.apache.org/jira/browse/HDFS-8417 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su When the file length is cellSize * dataBlocks + 123 and a pread starts from cellSize * dataBlocks + 1, it fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545421#comment-14545421 ] Walter Su commented on HDFS-7621: - The 005 patch is a total rewrite, except for the tests. Please review. Hints for reviewing: 1. New classes {{StripedBlockWithLocations}} and {{DBlockStriped}} represent a block group. 2. PendingMove represents a (source, target, reportedBlock) triple. A newly created PendingMove is (source, target, null). {{PendingMove.markMovedIfGoodBlock}} gets a block/block group from the source, validates it, parses it, and saves the reported block in the PendingMove. After that, the PendingMove is a complete (source, target, reportedBlock) triple; it is final and will be dispatched. Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: HDFS-8390.004.patch merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch, HDFS-8390.002.patch, HDFS-8390.003.patch, HDFS-8390.004.patch We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, HDFS-8131, and maybe HDFS-4894 and HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We then need to implement another N policies: {noformat} BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Attachment: HDFS-8342-HDFS-7285.001.patch Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: (was: HDFS-8352-HDFS-7285.001.repost.patch) Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7621: Attachment: HDFS-7621.005.patch Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: HDFS-8352-HDFS-7285.001.patch Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: (was: HDFS-8352-HDFS-7285.001.patch) Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Attachment: (was: HDFS-8342-HDFS-7285.001.repost.patch) Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Attachment: (was: HDFS-8342-HDFS-7285.001.patch) Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Description: We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, HDFS-8131, and maybe HDFS-4894 and HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We then need to implement another N policies: {noformat} BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault. was: We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, and maybe HDFS-4894 and HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We then need to implement another N policies: {noformat} BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault. merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch, HDFS-8390.002.patch, HDFS-8390.003.patch We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, HDFS-8131, and maybe HDFS-4894 and HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We then need to implement another N policies: {noformat} BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Attachment: HDFS-8342-HDFS-7285.001.repost.patch Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch, HDFS-8342-HDFS-7285.001.repost.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Attachment: HDFS-8352-HDFS-7285.001.repost.patch Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch, HDFS-8352-HDFS-7285.001.repost.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7621: Attachment: HDFS-7621.004.patch Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543305#comment-14543305 ] Walter Su commented on HDFS-7621: - ...moveBlockAcrossStorage requires the Mover to send the correct numOfBytes; otherwise the DN rejects the command. Problem solved. The Mover computes numBytes of an internal block from the ECSchema in HdfsFileStatus. The Balancer gets numBytes of an internal block from the NN, which computes it using BlockInfoStriped. Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch, HDFS-7621.004.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
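The numBytes of an internal block mentioned in the comment above has to be derived from the block group's data size and the striping parameters. The sketch below shows one way to do that derivation; it is not the code from the HDFS-7621 patches. The class {{InternalBlockLength}} and its method are hypothetical, and the rule that a parity block is as long as the first data block is an assumption stated in the code.

```java
/** Sketch: length of the i-th internal block of a striped block group (hypothetical helper). */
final class InternalBlockLength {
    /**
     * @param groupDataSize number of user data bytes stored in the block group
     * @param cellSize      striping cell size in bytes
     * @param dataBlocks    number of data blocks per group
     * @param index         internal block index; indexes >= dataBlocks are parity blocks
     */
    static long internalBlockLength(long groupDataSize, int cellSize, int dataBlocks, int index) {
        long stripeSize = (long) cellSize * dataBlocks;
        long fullStripes = groupDataSize / stripeSize;
        long lastStripeLen = groupDataSize % stripeSize;
        // Every internal block holds one full cell per complete stripe.
        long base = fullStripes * cellSize;
        if (lastStripeLen == 0) {
            return base;
        }
        long inLastStripe;
        if (index >= dataBlocks) {
            // Assumption: a parity cell is as long as the first (longest) data cell
            // of the partial stripe.
            inLastStripe = Math.min(lastStripeLen, (long) cellSize);
        } else {
            // Data cells of the partial stripe are filled in index order.
            long remaining = lastStripeLen - (long) index * cellSize;
            inLastStripe = Math.max(0L, Math.min(remaining, (long) cellSize));
        }
        return base + inLastStripe;
    }
}
```

For example, with an assumed cellSize of 64 KiB and 6 data blocks, a group holding cellSize * 6 + 123 bytes yields 65659 bytes for internal block 0, 65536 bytes for internal blocks 1 through 5, and 65659 bytes for each parity block under the stated assumption. A mover that sent the whole group size instead of these per-block lengths would presumably hit the rejection described in the comment.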
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: HDFS-8390.003.patch merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch, HDFS-8390.002.patch, HDFS-8390.003.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8199) Erasure Coding: System test of creating ECZone and EC files.
[ https://issues.apache.org/jira/browse/HDFS-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543488#comment-14543488 ] Walter Su commented on HDFS-8199: -
Nodes: 1 NN + 9 DN + 1 client.
The client node writes 100G of files (1G * 100) to an EC zone and to a non-EC directory.
normal files: 45m33s
EC files: 13m11s
CPU usage was around 100%~150% in both cases.
Erasure Coding: System test of creating ECZone and EC files. Key: HDFS-8199 URL: https://issues.apache.org/jira/browse/HDFS-8199 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Kai Sasaki Assignee: Yong Zhang Attachments: HDFS-8199.000.patch System test of creating ECZone and EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
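The two timings are easier to compare as throughput. The small calculation below is only a reading aid. It assumes that "1G" means 1 GiB, which the comment does not state, and the class name is hypothetical. Under that assumption the figures come to about 37 MiB/s for normal files and about 129 MiB/s for EC files, a ratio of roughly 3.5.

```java
// Reading aid only: converts the reported wall-clock times into throughput.
final class WriteThroughputSketch {
    private WriteThroughputSketch() {}

    static long toSeconds(int minutes, int seconds) {
        return minutes * 60L + seconds;
    }

    static double mibPerSecond(long totalMiB, long elapsedSeconds) {
        return (double) totalMiB / elapsedSeconds;
    }

    public static void main(String[] args) {
        long totalMiB = 100L * 1024;           // assumption: 100 files of 1 GiB each
        long normalSecs = toSeconds(45, 33);   // normal (replicated) files
        long ecSecs = toSeconds(13, 11);       // EC files
        System.out.printf("normal: %.1f MiB/s%n", mibPerSecond(totalMiB, normalSecs));
        System.out.printf("ec:     %.1f MiB/s%n", mibPerSecond(totalMiB, ecSecs));
        System.out.printf("ratio:  %.2f%n", (double) normalSecs / ecSecs);
    }
}
```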
[jira] [Commented] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543430#comment-14543430 ] Walter Su commented on HDFS-8186: -
1. {quote}...Balancer and NamenodeFsck. Should we address these 2 usages in this patch? If not, will NamenodeFsck work correctly?{quote} Changed them to use the default policy, as they did before. If they need ecPolicy, we can change the Balancer/fsck code in the corresponding jira.
2. Moved {{BlockPlacementPolicies}} to a new file so that Balancer/fsck can use it.
3. I don't see why we need a new test. It should be fine as long as the existing tests pass.
Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.002.patch, HDFS-8186.001.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. Not included: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
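The idea of keeping two coexisting policies and picking one per file can be sketched as follows. This is a simplified illustration, not the contents of the patch: every class name, the configuration keys and the policy names are hypothetical stand-ins, and a plain Map stands in for the XML configuration.

```java
import java.util.Map;

// Hypothetical stand-in for a block placement policy.
interface PlacementPolicySketch {
    String name();
}

final class NamedPolicySketch implements PlacementPolicySketch {
    private final String name;

    NamedPolicySketch(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }
}

// Holds one policy for replicated files and one for EC (striped) files.
final class PlacementPoliciesSketch {
    // Assumed keys for illustration only; they are not real HDFS configuration keys.
    static final String REPLICATION_KEY = "sketch.placement.replication";
    static final String EC_KEY = "sketch.placement.ec";

    private final PlacementPolicySketch replicationPolicy;
    private final PlacementPolicySketch ecPolicy;

    PlacementPoliciesSketch(Map<String, String> conf) {
        // Fall back to the default policy when a key is not configured.
        this.replicationPolicy = new NamedPolicySketch(conf.getOrDefault(REPLICATION_KEY, "default"));
        this.ecPolicy = new NamedPolicySketch(conf.getOrDefault(EC_KEY, "default"));
    }

    // Callers such as a balancer or fsck tool would ask for the policy by file type.
    PlacementPolicySketch getPolicy(boolean isStriped) {
        return isStriped ? ecPolicy : replicationPolicy;
    }
}
```

Keeping the holder in its own file, as item 2 of the comment describes, lets tools outside the block manager share the same selection logic.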
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Status: Patch Available (was: Open) Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.002.patch, HDFS-8186.001.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. Not included: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8186) Erasure coding: Make block placement policy for EC file configurable
[ https://issues.apache.org/jira/browse/HDFS-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8186: Attachment: HDFS-8186-HDFS-7285.002.patch Erasure coding: Make block placement policy for EC file configurable Key: HDFS-8186 URL: https://issues.apache.org/jira/browse/HDFS-8186 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8186-HDFS-7285.002.patch, HDFS-8186.001.patch This includes: 1. Users can configure the block placement policy for EC files in the XML configuration file. 2. The EC policy applies to EC files and the replication policy applies to non-EC files; the two coexist. Not included: 1. Details of the block placement policy for EC. Discussion and implementation go to HDFS-7613. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: HDFS-8390.001.patch merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Description: We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. was: We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. 
So every new policy only need to extends BlockPlacementPolicyDefault. merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: (was: HDFS-8390.001.patch) merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: HDFS-8390.001.patch merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Status: Patch Available (was: Open) merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
Walter Su created HDFS-8390: ---
Summary: merge BlockPlacementPolicyWithNodeGroup with default policy
Key: HDFS-8390
URL: https://issues.apache.org/jira/browse/HDFS-8390
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Walter Su
Assignee: Walter Su

We have seen requirements for adding new policies (HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy needs to support NodeGroup. Assume we have N policies:
{noformat}
BlockPlacementPolicyDefault
BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault
BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault
BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault
{noformat}
We then need to implement another N policies:
{noformat}
BlockPlacementPolicyWithNodeGroup
BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup
BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup
BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup
{noformat}
We had better merge nodeGroup awareness into the default policy, so that every new policy only needs to extend BlockPlacementPolicyDefault.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
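The proposal in this issue can be illustrated with a toy model in which node-group awareness is a switch inside the base policy and customized policies only override a hook. This is a sketch of the design idea under simplifying assumptions, not the HDFS-8390 patch. All names are hypothetical, and "placement" is reduced to picking candidates in order while skipping node groups that are already used.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical node description: rack, node group within the rack, and host.
final class NodeSketch {
    final String rack;
    final String nodeGroup;
    final String host;

    NodeSketch(String rack, String nodeGroup, String host) {
        this.rack = rack;
        this.nodeGroup = nodeGroup;
        this.host = host;
    }
}

// Base policy: node-group awareness is a flag here, not a separate subclass.
class DefaultPolicySketch {
    private final boolean nodeGroupAware;

    DefaultPolicySketch(boolean nodeGroupAware) {
        this.nodeGroupAware = nodeGroupAware;
    }

    // Hook that customized policies override; the base policy accepts every node.
    protected boolean isAcceptable(NodeSketch candidate) {
        return true;
    }

    // Picks up to 'wanted' targets in candidate order.
    final List<NodeSketch> chooseTargets(List<NodeSketch> candidates, int wanted) {
        List<NodeSketch> chosen = new ArrayList<>();
        Set<String> usedGroups = new HashSet<>();
        for (NodeSketch c : candidates) {
            if (chosen.size() == wanted) {
                break;
            }
            if (!isAcceptable(c)) {
                continue;
            }
            // When node-group aware, allow at most one target per node group.
            if (nodeGroupAware && !usedGroups.add(c.rack + "/" + c.nodeGroup)) {
                continue;
            }
            chosen.add(c);
        }
        return chosen;
    }
}

// A customized policy extends only the base class and still gets node-group awareness.
class ExcludeRackPolicySketch extends DefaultPolicySketch {
    private final String excludedRack;

    ExcludeRackPolicySketch(boolean nodeGroupAware, String excludedRack) {
        super(nodeGroupAware);
        this.excludedRack = excludedRack;
    }

    @Override
    protected boolean isAcceptable(NodeSketch candidate) {
        return !excludedRack.equals(candidate.rack);
    }
}
```

With this shape, N customized policies need N classes rather than 2N, because no `...WithNodeGroup` twin is required for each of them.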
[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543136#comment-14543136 ] Walter Su commented on HDFS-7621: -
Thanks [~zhz] for the helpful comments.
{quote}...convertToBlockWithLocations does similar things as the updated addBlock method{quote}
I cleaned up the code and added Javadoc, so it is clearer now. The two methods implement different logic.
{quote}... I still think it's easier to just include an final int[] blockIndices{quote}
Parsing at the NN takes less code, and the changes to Dispatcher are also small.
{quote}...nonCollocatedBlock only makes sense for striped block group{quote}
I added Javadoc to nonCollocatedBlock. You can think of it as a normal block that needs to avoid collocation. We can implement collocation the same way in the future if there is a requirement.
{quote}...why Balancer is using this BlocksWithLocations structure while Mover uses LocatedBlocks{quote}
The Balancer handles blocks and does not know about files; it gets blocks from a node. The Mover knows about files; it gets blocks from a file.
{quote}...moveBlockAcrossStorage requires Mover send correct numOfBytes, otherwise DN reject the command.{quote}
My concern is {{moveBlockAcrossStorage}}. HDFS-8289 can't solve the problem. The Balancer doesn't know about files. Is it necessary to check for corruption when moving a block across storage inside a DN?
Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8390) merge BlockPlacementPolicyWithNodeGroup with default policy
[ https://issues.apache.org/jira/browse/HDFS-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8390: Attachment: HDFS-8390.002.patch merge BlockPlacementPolicyWithNodeGroup with default policy --- Key: HDFS-8390 URL: https://issues.apache.org/jira/browse/HDFS-8390 Project: Hadoop HDFS Issue Type: Improvement Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8390.001.patch, HDFS-8390.002.patch We saw requirements for adding new policies.( HDFS-7613, HDFS-7892, maybe HDFS-4894, HDFS-7068 in the future). Every policy need to support NodeGroup. Assume we have N policies: {noformat} BlockPlacementPolicyDefault BlockPlacementPolicyCustomized1 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized2 extends BlockPlacementPolicyDefault BlockPlacementPolicyCustomized3 extends BlockPlacementPolicyDefault {noformat} We need to implements another N policies: {noformat} BlockPlacementPolicyyWithNodeGroup BlockPlacementPolicyCustomized1WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized2WithNodeGroup extends BlockPlacementPolicyWithNodeGroup BlockPlacementPolicyCustomized3WithNodeGroup extends BlockPlacementPolicyWithNodeGroup {noformat} We had better merge nodeGroup awareness into default policy. So every new policy only need to extends BlockPlacementPolicyDefault. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic
[ https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7621: Attachment: HDFS-7621.003.patch Erasure Coding: update the Balancer/Mover data migration logic -- Key: HDFS-7621 URL: https://issues.apache.org/jira/browse/HDFS-7621 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Walter Su Labels: HDFS-7285 Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, HDFS-7621.003.patch Currently the Balancer/Mover only considers the distribution of replicas of the same block during data migration: the migration cannot decrease the number of racks. With EC the Balancer and Mover should also take into account the distribution of blocks belonging to the same block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
[ https://issues.apache.org/jira/browse/HDFS-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8364: Description: issue 1: it's not user-friendly. We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: We create zone twice at / {noformat} # bin/hdfs erasurecode -createZone / EC Zone created successfully at hdfs://ds-34:9000/ # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path hdfs://ds-34:9000/ {noformat} It doesn't print detail error message. was: issue 1: it's not user-friendly. 
We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: {noformat} # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path hdfs://datasight-187:9000/ {noformat} But we can create / zone by API. Erasure coding: fix some minor bugs in EC CLI - Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor issue 1: it's not user-friendly. 
We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at
[jira] [Updated] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
[ https://issues.apache.org/jira/browse/HDFS-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8364: Attachment: HDFS-8364-HDFS-7285.001.patch Erasure coding: fix some minor bugs in EC CLI - Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8364-HDFS-7285.001.patch issue 1: it's not user-friendly. We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: We create zone twice at / {noformat} # bin/hdfs erasurecode -createZone / EC Zone created successfully at hdfs://ds-34:9000/ # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path 
hdfs://ds-34:9000/ {noformat} It doesn't print detail error message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
[ https://issues.apache.org/jira/browse/HDFS-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8364: Status: Patch Available (was: Open) Erasure coding: fix some minor bugs in EC CLI - Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8364-HDFS-7285.001.patch issue 1: it's not user-friendly. We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: We create zone twice at / {noformat} # bin/hdfs erasurecode -createZone / EC Zone created successfully at hdfs://ds-34:9000/ # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path hdfs://ds-34:9000/ 
{noformat} It doesn't print detail error message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
[ https://issues.apache.org/jira/browse/HDFS-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539444#comment-14539444 ] Walter Su commented on HDFS-8364: -
after 001 patch:
# bin/hdfs erasurecode -getZoneInfo /
Path hdfs://ds-34:9000/ is not in EC zone
# bin/hdfs erasurecode -createZone /
EC Zone created successfully at hdfs://ds-34:9000/
# bin/hdfs erasurecode -createZone /
createZone: Unable to create EC zone for the path hdfs://ds-34:9000/. Directory / is already in an erasure coding zone.
Erasure coding: fix some minor bugs in EC CLI - Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8364-HDFS-7285.001.patch issue 1: it's not user-friendly. We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... 
zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: We create zone twice at / {noformat} # bin/hdfs erasurecode -createZone / EC Zone created successfully at hdfs://ds-34:9000/ # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path hdfs://ds-34:9000/ {noformat} It doesn't print detail error message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
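The change in behaviour for issue 2, from a NullPointerException to the message "Path ... is not in EC zone", is consistent with handling a missing zone explicitly before using it. The sketch below shows that pattern only. It is not the code of the 001 patch: the class names are hypothetical, the cause of the original exception is an assumption based on the stack trace, and the output format for a path that is inside a zone is invented for illustration.

```java
// Hypothetical stand-in for the zone information returned for a path.
final class ZoneInfoSketch {
    final String dir;
    final String schemaName;

    ZoneInfoSketch(String dir, String schemaName) {
        this.dir = dir;
        this.schemaName = schemaName;
    }
}

final class GetZoneInfoSketch {
    private GetZoneInfoSketch() {}

    // Builds the line printed for one path; 'info' is null when the path is not in any zone.
    static String describe(String path, ZoneInfoSketch info) {
        if (info == null) {
            // Report the condition instead of dereferencing a null result.
            return "Path " + path + " is not in EC zone";
        }
        return "Dir: " + info.dir + ", Schema: " + info.schemaName;
    }
}
```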
[jira] [Updated] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
[ https://issues.apache.org/jira/browse/HDFS-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8364: Attachment: HDFS-8364-HDFS-7285.002.patch Erasure coding: fix some minor bugs in EC CLI - Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8364-HDFS-7285.001.patch, HDFS-8364-HDFS-7285.002.patch
[jira] [Commented] (HDFS-8323) Bump GenerationStamp for write faliure in DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539696#comment-14539696 ]

Walter Su commented on HDFS-8323:

1. {{DFSStripedOutputStream}} holds the streamer lock when it calls {{setExternalError}}, but the streamer thread itself sets {{hasError=false}} without the {{synchronized}} keyword.

2. This one is harder to describe. First, look at the code below.
{code:title=BlockManager.java}
public LocatedBlock newLocatedBlock(ExtendedBlock eb, BlockInfo info,
    DatanodeStorageInfo[] locs, long offset) throws IOException {
  final LocatedBlock lb;
  if (info.isStriped()) {
    lb = newLocatedStripedBlock(eb, locs,
        ((BlockInfoStripedUnderConstruction)info).getBlockIndices(),
        offset, false);
  } else {
    lb = newLocatedBlock(eb, locs, offset, false);
  }
  setBlockToken(lb, BlockTokenIdentifier.AccessMode.WRITE);
  return lb;
}
{code}
The returned indices come from {{BlockInfoStripedUnderConstruction.getBlockIndices()}}, whose length depends on the block reports received so far and is therefore dynamic. This could be problematic. Consider the following situations:

*Situation A*
1. Create 9 streamers.
2. Streamer #5 fails at the very start.
3. Because #5 never connected to a DN, only 8 UC blocks are created when the file is created.
4. The last BlockInfoStripedUnderConstruction has 8 replicas reported.
5. The leading streamer #0 recovers the pipeline and gets a new locatedBlock.
6. StripedBlockUtil.parseStripedBlockGroup(..) creates locatedBlock\[9\].
7. locatedBlock\[5\] == null
8. The leading streamer #0 hits an NPE:
{noformat}
2015-05-12 18:47:11,117 WARN hdfs.DataStreamer (DataStreamer.java:run(572)) - DataStreamer Exception
java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:410)
    at org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.offer(DFSStripedOutputStream.java:73)
    at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:133)
    at org.apache.hadoop.hdfs.StripedDataStreamer.putLoactedBlocks(StripedDataStreamer.java:129)
    at org.apache.hadoop.hdfs.StripedDataStreamer.updateBlockForPipeline(StripedDataStreamer.java:136)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1291)
    at org.apache.hadoop.hdfs.DataStreamer.processDatanodeError(DataStreamer.java:1022)
{noformat}

*Situation B*
Two DNs fail in a row. Same problem as in Situation A.

By the way, I saw some merge conflicts with HDFS-8220. Could you take a look and see whether the changes are OK with you?

Bump GenerationStamp for write faliure in DFSStripedOutputStream

Key: HDFS-8323
URL: https://issues.apache.org/jira/browse/HDFS-8323
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Attachments: h8323_20150511.patch, h8323_20150511b.patch
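The NPE in Situation A follows from the fact that java.util.concurrent.LinkedBlockingQueue.offer rejects null elements. Below is a minimal sketch of a guard, assuming String as a stand-in for LocatedBlock and a hypothetical helper named putStripedBlocks; neither the class nor the method comes from the patch under review.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch. String stands in for LocatedBlock, and none of these
// names come from DFSStripedOutputStream.
final class StripedOfferSketch {
  /** Hands blocks[i] to streamer i's queue, failing with a readable error on a gap. */
  static void putStripedBlocks(String[] blocks, List<BlockingQueue<String>> queues)
      throws IOException {
    // Validate first so that no queue is touched when the group is incomplete.
    for (int i = 0; i < queues.size(); i++) {
      if (i >= blocks.length || blocks[i] == null) {
        throw new IOException("No located block for streamer #" + i
            + "; got " + countNonNull(blocks) + " of " + queues.size());
      }
    }
    for (int i = 0; i < queues.size(); i++) {
      queues.get(i).offer(blocks[i]);
    }
  }

  private static int countNonNull(String[] blocks) {
    int n = 0;
    for (String b : blocks) {
      if (b != null) {
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args) {
    List<BlockingQueue<String>> queues = new ArrayList<>();
    String[] blocks = new String[9];
    for (int i = 0; i < 9; i++) {
      queues.add(new LinkedBlockingQueue<String>());
      blocks[i] = (i == 5) ? null : "blk_" + i; // Situation A: entry 5 is missing
    }
    try {
      putStripedBlocks(blocks, queues);
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Checking every entry before the first offer is a design choice of this sketch: it keeps the queues untouched when the group is incomplete, so the caller sees one descriptive IOException instead of a half-distributed block group.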
[jira] [Created] (HDFS-8365) Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer
Walter Su created HDFS-8365: --- Summary: Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer Key: HDFS-8365 URL: https://issues.apache.org/jira/browse/HDFS-8365 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Currently, each innerBlock of blockGroup should put on different node. One node can has 2 innerBlock. If one node has 2 innerBlock, we have blockReport issue. The first reported innerBlock will be added to triplets, but the second won't. If we decide to not to support 2 innerBlock in one node. We should handle this situation, and output warning friendly. When there are only 8 DN, and ECSchema is RS-6-3 {noformat} # bin/hdfs dfs -put README.txt /ecdir 15/05/11 13:48:30 WARN hdfs.DataStreamer: DataStreamer Exception java.lang.NullPointerException at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:410) at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:115) at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:120) at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1360) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:457) at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48) 15/05/11 13:48:30 WARN hdfs.DFSOutputStream: Failed: closeImpl, DFSStripedOutputStream:#0: isFailed? 
f, null@null java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.set(DataStreamer.java:183) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:571) at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48) Caused by: java.lang.NullPointerException at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:410) at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:115) at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:120) at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1360) at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:457) ... 1 more 15/05/11 13:48:30 WARN hdfs.DFSOutputStream: Failed: closeImpl, DFSStripedOutputStream:#1: isFailed? f, null@null java.nio.channels.ClosedChannelException at org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.throwException4Close(DataStreamer.java:208) at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:146) at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:713) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
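The arithmetic behind this report can be sketched in a few lines: an RS-6-3 schema has 6 + 3 = 9 internal blocks per group, so 8 datanodes cannot give every internal block its own node. The helper below is hypothetical and is not part of HDFS; datanodes are plain strings, and the method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HDFS: datanodes are plain strings here.
final class BlockGroupPlacementSketch {
  /** Number of internal blocks in one group, e.g. 6 + 3 = 9 for RS-6-3. */
  static int groupSize(int dataUnits, int parityUnits) {
    return dataUnits + parityUnits;
  }

  /** True if there are enough datanodes to give each internal block its own node. */
  static boolean canPlaceOnDistinctNodes(int liveDatanodes, int dataUnits, int parityUnits) {
    return liveDatanodes >= groupSize(dataUnits, parityUnits);
  }

  /** Returns the indices of internal blocks that share a node with an earlier block. */
  static List<Integer> colocatedBlocks(List<String> nodeOfBlock) {
    Set<String> seen = new HashSet<>();
    List<Integer> duplicates = new ArrayList<>();
    for (int i = 0; i < nodeOfBlock.size(); i++) {
      // Set.add returns false when the node already hosts an earlier block.
      if (!seen.add(nodeOfBlock.get(i))) {
        duplicates.add(i);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    System.out.println(canPlaceOnDistinctNodes(8, 6, 3));
    System.out.println(canPlaceOnDistinctNodes(9, 6, 3));
  }
}
```

A check of this kind, run before any streamer starts, would let the client print a friendly warning about the shortage instead of failing later with the NullPointerException shown above.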
[jira] [Resolved] (HDFS-8365) Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer
[ https://issues.apache.org/jira/browse/HDFS-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Walter Su resolved HDFS-8365.
Resolution: Duplicate

Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer

Key: HDFS-8365
URL: https://issues.apache.org/jira/browse/HDFS-8365
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Walter Su
Assignee: Walter Su

Currently, each innerBlock of a blockGroup should be placed on a different node; one node can't hold 2 innerBlocks. If one node does hold 2 innerBlocks, we have a blockReport issue: the first reported innerBlock will be added to triplets, but the second won't. If we decide not to support 2 innerBlocks on one node, we should handle this situation and output a friendly warning.

When there are only 8 DNs and the ECSchema is RS-6-3:
{noformat}
# bin/hdfs dfs -put README.txt /ecdir
15/05/11 13:48:30 WARN hdfs.DataStreamer: DataStreamer Exception
java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:410)
    at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:115)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:120)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1360)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:457)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48)
15/05/11 13:48:30 WARN hdfs.DFSOutputStream: Failed: closeImpl, DFSStripedOutputStream:#0: isFailed? f, null@null
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.set(DataStreamer.java:183)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:571)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:48)
Caused by: java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:410)
    at org.apache.hadoop.hdfs.DFSStripedOutputStream$Coordinator.putStripedBlock(DFSStripedOutputStream.java:115)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:120)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1360)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:457)
    ... 1 more
15/05/11 13:48:30 WARN hdfs.DFSOutputStream: Failed: closeImpl, DFSStripedOutputStream:#1: isFailed? f, null@null
java.nio.channels.ClosedChannelException
    at org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.throwException4Close(DataStreamer.java:208)
    at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:146)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:713)
{noformat}
[jira] [Updated] (HDFS-8365) Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer
[ https://issues.apache.org/jira/browse/HDFS-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8365: Description: Currently, each innerBlock of blockGroup should put on different node. One node can't has 2 innerBlock. If one node has 2 innerBlock, we have blockReport issue. The first reported innerBlock will be added to triplets, but the second won't. If we decide to not to support 2 innerBlock in one node. We should handle this situation, and output warning friendly. was: Currently, each innerBlock of blockGroup should put on different node. One node can has 2 innerBlock. If one node has 2 innerBlock, we have blockReport issue. The first reported innerBlock will be added to triplets, but the second won't. If we decide to not to support 2 innerBlock in one node. We should handle this situation, and output warning friendly.
[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize
[ https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537729#comment-14537729 ]

Walter Su commented on HDFS-8220:

You moved assert nodes ==
The location is good. I think assert is not enough. Assertions are disabled by default.

...For the safer side, IMHO we could do a validation at the StripedDataStreamer to avoid NPE now.
That's what I mean. Maybe throw an IOException and add a friendly message. An NPE is awful.

Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

Key: HDFS-8220
URL: https://issues.apache.org/jira/browse/HDFS-8220
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch

During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to validate the available datanodes against the {{BlockGroupSize}}. Please see the exception to understand more:
{code}
2015-04-22 14:56:11,313 WARN hdfs.DFSClient (DataStreamer.java:run(538)) - DataStreamer Exception
java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
2015-04-22 14:56:11,313 INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
2015-04-22 14:56:11,313 ERROR hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
java.io.IOException: DataStreamer Exception:
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
    at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
Caused by: java.lang.NullPointerException
    at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
    at org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
    at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
    ... 1 more
{code}
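The point about assertions being disabled by default can be illustrated with a small Java sketch. It is not the HDFS-8220 patch: LocationCheckSketch and both method names are hypothetical, and the location array is a plain Object[] rather than real datanode descriptors.

```java
import java.io.IOException;

// Hypothetical sketch. Neither method is taken from StripedDataStreamer.
final class LocationCheckSketch {
  /** Uses assert: the check is skipped unless the JVM is started with -ea. */
  static void checkWithAssert(Object[] locations, int blockGroupSize) {
    assert locations.length == blockGroupSize : "unexpected number of locations";
  }

  /** Uses an explicit check: it always runs and says what went wrong. */
  static void checkWithException(Object[] locations, int blockGroupSize)
      throws IOException {
    if (locations.length != blockGroupSize) {
      throw new IOException("Expected " + blockGroupSize
          + " block locations but got " + locations.length);
    }
  }

  public static void main(String[] args) {
    Object[] locations = new Object[8]; // e.g. 8 locations for a 9-block group
    checkWithAssert(locations, 9);      // passes silently when -ea is not set
    try {
      checkWithException(locations, 9);
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because the number of returned locations depends on the cluster state at run time, it is a condition that might happen rather than one that should never happen, which is the case for an exception rather than an assertion.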
[jira] [Commented] (HDFS-8365) Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer
[ https://issues.apache.org/jira/browse/HDFS-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537686#comment-14537686 ] Walter Su commented on HDFS-8365: - Sorry, I didn't see it. Of course. Erasure Coding: Badly treated when short of Datanode in StripedDataStreamer --- Key: HDFS-8365 URL: https://issues.apache.org/jira/browse/HDFS-8365 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su
[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize
[ https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537872#comment-14537872 ] Walter Su commented on HDFS-8220: - bq. Assertions should be used to check something that should never happen, while an exception should be used to check something that might happen. (from [When to use an assertion and when to use an exception|http://stackoverflow.com/questions/1957645/when-to-use-an-assertion-and-when-to-use-an-exception] ) Here, in this issue, we can't know lsb.getLocations().length beforehand, right? Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize --- Key: HDFS-8220 URL: https://issues.apache.org/jira/browse/HDFS-8220 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch
[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize
[ https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539185#comment-14539185 ] Walter Su commented on HDFS-8220: - A minor tip: I don't get the expected exception when running the test. I think cluster.waitActive() doesn't wait for already-started DNs, so the restarted NN has zero DNs when the file is created. Could you trigger heartbeats manually after restarting the NN? Thanks for your work. Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize --- Key: HDFS-8220 URL: https://issues.apache.org/jira/browse/HDFS-8220 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch
[jira] [Created] (HDFS-8364) Erasure coding: fix some minor bugs in EC CLI
Walter Su created HDFS-8364: --- Summary: Erasure coding: fix some minor bugs in EC CLI Key: HDFS-8364 URL: https://issues.apache.org/jira/browse/HDFS-8364 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Priority: Minor issue 1: it's not user-friendly. We should tell user that `hdfs erasurecode` is the entrance to EC commands(HDFS-7349) {noformat} # bin/hdfs Usage: hdfs [--config confdir] [--daemon (start|stop|status)] [--loglevel loglevel] COMMAND where COMMAND is one of: balancer run a cluster balancing utility cacheadmin configure the HDFS cache crypto configure HDFS encryption zones ... zkfc run the ZK Failover Controller daemon {noformat} issue 2: it's not user-friendly {noformat} # bin/hdfs erasurecode -getZoneInfo / -getZoneInfo: Fatal internal error java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.erasurecode.ECCommand$GetECZoneInfoCommand.processPath(ECCommand.java:168) at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321) at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293) at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275) at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259) at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:291) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hdfs.tools.erasurecode.ECCli.main(ECCli.java:45) {noformat} possible issue 3: {noformat} # bin/hdfs erasurecode -createZone / createZone: Unable to create EC zone for the path hdfs://datasight-187:9000/ {noformat} But we can create / zone by API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Status: Patch Available (was: Open) Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8342) Erasure Coding: test skip in TestDFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8342: Attachment: HDFS-8342-HDFS-7285.001.patch Erasure Coding: test skip in TestDFSStripedInputStream -- Key: HDFS-8342 URL: https://issues.apache.org/jira/browse/HDFS-8342 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8342-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8352) Erasure Coding: test webhdfs read write stripe file
[ https://issues.apache.org/jira/browse/HDFS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8352: Status: Patch Available (was: Open) Erasure Coding: test webhdfs read write stripe file --- Key: HDFS-8352 URL: https://issues.apache.org/jira/browse/HDFS-8352 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8352-HDFS-7285.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)