[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359619#comment-14359619
 ] 

Zhe Zhang commented on HDFS-7337:
---------------------------------

Thanks Kai for the update! The design looks good to me overall.

I also took the chance to look at {{ErasureCodec}} and {{ECSchema}} again. 
IIUC, {{ErasureCodec}} is like a factory or a utility class, which creates 
{{ErasureCoder}} and {{BlockGrouper}} based on {{ECSchema}}. 

If that's the case, I think we can leverage the pattern of 
{{BlockStoragePolicySuite}}. Something like:
{code}
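public static ECSchemaSuite createDefaultSuite() {
    // Schemas are indexed by ID; this assumes the two IDs are 0 and 1.
    final ECSchema[] schemas = new ECSchema[2];
    // chunkSize is assumed to come from configuration or a default constant.
    final byte RS63 = HdfsConstants.RS63_EC_SCHEMA_ID;
    schemas[RS63] = new ECSchema(RS63,
        HdfsConstants.RS63_EC_SCHEMA_NAME,
        HdfsConstants.RS_EC_ALGORITHM_ID,
        6, 3, chunkSize);
    final byte XOR21 = HdfsConstants.XOR21_EC_SCHEMA_ID;
    schemas[XOR21] = new ECSchema(XOR21,
        HdfsConstants.XOR21_EC_SCHEMA_NAME,
        HdfsConstants.XOR_EC_ALGORITHM_ID,
        2, 1, chunkSize);
    // Hypothetical constructor, by analogy with BlockStoragePolicySuite.
    return new ECSchemaSuite(schemas);
}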
{code}

Then NN can just pass around the schema ID when communicating with DN and 
client, which is much smaller than an {{ErasureCodec}} object.
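
To make the ID-based lookup concrete, here is a minimal, self-contained sketch of how the receiving side (DN or client) might resolve a schema ID back into a full schema. Everything in it is an assumption for illustration only: the simplified {{ECSchema}} fields, the {{ECSchemaSuite}} constructor, the {{getSchema}} method, the ID values 0 and 1, and the schema names are hypothetical stand-ins, not the actual HDFS classes or constants.

```java
/** Hypothetical, simplified stand-in for the ECSchema discussed above. */
final class ECSchema {
    final byte id;
    final String name;
    final int dataUnits;
    final int parityUnits;
    final int chunkSize;

    ECSchema(byte id, String name, int dataUnits, int parityUnits,
             int chunkSize) {
        this.id = id;
        this.name = name;
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
        this.chunkSize = chunkSize;
    }
}

/** Hypothetical suite that maps a one-byte schema ID to its schema. */
final class ECSchemaSuite {
    // Assumed ID values; real ones would presumably live in HdfsConstants.
    static final byte RS63_ID = 0;
    static final byte XOR21_ID = 1;

    private final ECSchema[] schemas;

    ECSchemaSuite(ECSchema[] schemas) {
        // Defensive copy so callers cannot mutate the suite afterwards.
        this.schemas = schemas.clone();
    }

    static ECSchemaSuite createDefaultSuite(int chunkSize) {
        final ECSchema[] schemas = new ECSchema[2];
        schemas[RS63_ID] = new ECSchema(RS63_ID, "RS-6-3", 6, 3, chunkSize);
        schemas[XOR21_ID] = new ECSchema(XOR21_ID, "XOR-2-1", 2, 1, chunkSize);
        return new ECSchemaSuite(schemas);
    }

    /** Resolves an ID received over the wire; rejects unknown IDs. */
    ECSchema getSchema(byte id) {
        if (id < 0 || id >= schemas.length || schemas[id] == null) {
            throw new IllegalArgumentException("Unknown EC schema ID: " + id);
        }
        return schemas[id];
    }
}
```

With something along these lines, only the single-byte ID would need to travel between NN, DN and client, and each side could rebuild the coder parameters locally from its own copy of the suite.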

> Configurable and pluggable Erasure Codec and schema
> ---------------------------------------------------
>
>                 Key: HDFS-7337
>                 URL: https://issues.apache.org/jira/browse/HDFS-7337
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Kai Zheng
>         Attachments: HDFS-7337-prototype-v1.patch, 
> HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
> PluggableErasureCodec-v2.pdf, PluggableErasureCodec.pdf
>
>
> According to HDFS-7285 and its design, this JIRA proposes supporting multiple 
> erasure codecs via a pluggable approach. It allows defining and configuring 
> multiple codec schemas with different coding algorithms and parameters. The 
> resulting codec schemas can then be specified via a command-line tool for 
> different file folders. While designing and implementing this pluggable 
> framework, we will also implement a concrete default codec (Reed-Solomon) to 
> prove that the framework is useful and workable. A separate JIRA could be 
> opened for the RS codec implementation.
> Note that HDFS-7353 will focus on the very low-level codec API and 
> implementation, making concrete vendor libraries transparent to the upper 
> layer. This JIRA focuses on the high-level pieces that interact with 
> configuration, schemas, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
