[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544498#comment-14544498
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7687:
-------------------------------------------


- Question: Do we need {{ErasureCodingResult}}s when we support multiple 
{{ECSchema}}s?

- Some suggestions on the terms:
||Replication||Erasure Coding||
| block | block group |
| replica | ec-block |
| UNDER MIN REPL'D BLOCKS | UNRECOVERABLE BLOCK GROUPS |
| DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY | MIN REQUIRED EC BLOCK# |
| Minimally replicated blocks | Minimally erasure-coded block groups |
| Over-replicated blocks | Over-erasure-coded block groups |
| Under-replicated blocks | Under-erasure-coded block groups |
| Mis-replicated blocks | Unsatisfactory placement block groups |
| Default replication factor | Default schema |
| Average block replication | Average block group size |
| Missing replicas | Missing ec-blocks |
| Decommissioned Replicas | Decommissioned ec-blocks |
| Decommissioning Replicas | Decommissioning ec-blocks |

- It is good to add two new classes ReplicationResult and ErasureCodingResult.  
Then, we can rename AbstractResult back to Result.
- {{minReplication}} should remain final.  The subclasses can initialize it 
through the super constructor, i.e.
{code}
  static abstract class Result {
    ...

    final int minReplication;
    
    Result(int minReplication) {
      this.minReplication = minReplication;
    }

    ...
  }

  @VisibleForTesting
  static class ReplicationResult extends Result {
    final short replication;

    ReplicationResult(Configuration conf) {
      super(conf.getInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY,
                        DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_DEFAULT));
      this.replication = (short)conf.getInt(DFSConfigKeys.DFS_REPLICATION_KEY,
                                            DFSConfigKeys.DFS_REPLICATION_DEFAULT);
    }
    ...
  }

  @VisibleForTesting
  static class ErasureCodingResult extends Result {
    final String ecSchema;

    ErasureCodingResult(Configuration conf) {
      this(ErasureCodingSchemaManager.getSystemDefaultSchema());
    }

    ErasureCodingResult(ECSchema ecSchema) {
      super(ecSchema.getNumDataUnits());
      this.ecSchema = ecSchema.getSchemaName();
    }

    ...
  }
{code}
- The check method can be simplified as below:
{code}
    final Result r = file.getReplication() == 0 ? ecRes : replRes;
    collectFileSummary(path, file, r, blocks);
    if (showprogress && (replRes.totalFiles + ecRes.totalFiles) % 100 == 0) {
      out.println();
      out.flush();
    }
    collectBlocksSummary(parent, file, r, blocks);
{code}
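
For reference, the two suggestions above can be tried outside of Hadoop with a small standalone sketch. Everything Hadoop-specific is replaced by hypothetical stand-ins: {{SketchSchema}} stands in for {{ECSchema}}, the constructors take plain values instead of a {{Configuration}}, and the schema name "RS-6-3" and the numbers are only sample values. It is meant to illustrate the final field set through the super constructor and the replication == 0 selection, not the actual patch.

```java
// Standalone sketch; none of these classes are the real Hadoop ones.
class FsckResultSketch {

  /** Hypothetical stand-in for ECSchema, with only the two accessors used above. */
  static final class SketchSchema {
    private final String name;
    private final int numDataUnits;

    SketchSchema(String name, int numDataUnits) {
      this.name = name;
      this.numDataUnits = numDataUnits;
    }

    String getSchemaName() { return name; }
    int getNumDataUnits() { return numDataUnits; }
  }

  abstract static class Result {
    // Stays final: the only place it is assigned is this constructor.
    final int minReplication;
    long totalFiles = 0;

    Result(int minReplication) {
      this.minReplication = minReplication;
    }
  }

  static final class ReplicationResult extends Result {
    final short replication;

    // The real class would read both values from Configuration.
    ReplicationResult(int minReplication, short replication) {
      super(minReplication);
      this.replication = replication;
    }
  }

  static final class ErasureCodingResult extends Result {
    final String ecSchema;

    ErasureCodingResult(SketchSchema schema) {
      // The number of data units plays the role of the minimum for EC.
      super(schema.getNumDataUnits());
      this.ecSchema = schema.getSchemaName();
    }
  }

  /** Replication 0 marks an EC file, as in the simplified check method. */
  static Result select(short fileReplication,
                       ReplicationResult replRes,
                       ErasureCodingResult ecRes) {
    return fileReplication == 0 ? ecRes : replRes;
  }

  public static void main(String[] args) {
    ReplicationResult replRes = new ReplicationResult(1, (short) 3);
    ErasureCodingResult ecRes =
        new ErasureCodingResult(new SketchSchema("RS-6-3", 6));

    // Three sample files: two replicated, one erasure-coded.
    short[] fileReplications = {3, 0, 3};
    for (short repl : fileReplications) {
      select(repl, replRes, ecRes).totalFiles++;
    }

    System.out.println("replicated files: " + replRes.totalFiles);
    System.out.println("ec files: " + ecRes.totalFiles);
    System.out.println("ec minimum: " + ecRes.minReplication);
  }
}
```

Since both subclasses share the base type, the caller only needs one reference ({{r}} in the snippet above) for the summary methods, while the per-kind counters stay separate.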


> Change fsck to support EC files
> -------------------------------
>
>                 Key: HDFS-7687
>                 URL: https://issues.apache.org/jira/browse/HDFS-7687
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Takanobu Asanuma
>         Attachments: HDFS-7687.1.patch, HDFS-7687.2.patch, HDFS-7687.3.patch
>
>
> We need to change fsck so that it can detect "under replicated" and corrupted 
> EC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
