[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271960#comment-14271960
 ] 

Kai Zheng commented on HDFS-7337:
---------------------------------

Continued.

bq. Logically, BlockGroup is applicable even without EC, because striping can 
be done without EC. So an alternative is to put it in the protocol package.
Good point! I agree we need to decouple them. EC does need something like 
ECBlockGroup that can be derived from BlockGroup. Maybe we need a better name 
for it.

bq. I don't think we should reference the schema through a name (since it 
wastes space and is fragile).
I agree and will investigate further.

bq. It's great that we are considering LRC in advance. However, with LEGAL-211 
pending, I suggest we keep BlockGroup simpler for now. For example, it can 
contain only dataBlocks and parityBlocks. When we implement LRC we can subclass 
or extend it.
Good point. Let me see how it can be simplified. Basically you're right: I 
think only data blocks and parity blocks are needed, whatever the coding 
algorithm. To create a BlockGroup, ECManager only needs to provide an array of 
data blocks as sources and an array of parity blocks as placeholders, plus a 
block group id. How these blocks are organized/ordered is specific to the 
codec and can be hidden from the outside. So the SubBlockGroup stuff is 
actually only needed by the codec framework. Sure, I will make it internal to 
avoid polluting the public API.
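
To make the simplified shape concrete, here is a rough sketch of what such a 
BlockGroup could look like. This is an illustration only, not the code in the 
attached prototype patches: the field names, the accessors, and the use of 
plain long block ids in place of real block objects are all assumptions.

```java
/**
 * Hypothetical sketch of the simplified BlockGroup discussed above.
 * Blocks are represented by long ids purely for illustration; the real
 * type, names, and package are assumptions, not the patch's API.
 */
class BlockGroup {
    private final long groupId;
    private final long[] dataBlocks;   // source blocks to be encoded
    private final long[] parityBlocks; // placeholders for the encoded output

    BlockGroup(long groupId, long[] dataBlocks, long[] parityBlocks) {
        if (dataBlocks == null || dataBlocks.length == 0) {
            throw new IllegalArgumentException("at least one data block is required");
        }
        if (parityBlocks == null) {
            throw new IllegalArgumentException("parityBlocks must not be null");
        }
        this.groupId = groupId;
        // Defensive copies so callers cannot reorder the blocks afterwards.
        this.dataBlocks = dataBlocks.clone();
        this.parityBlocks = parityBlocks.clone();
    }

    long getGroupId() {
        return groupId;
    }

    long[] getDataBlocks() {
        return dataBlocks.clone();
    }

    long[] getParityBlocks() {
        return parityBlocks.clone();
    }
}
```

The class is deliberately not final, so an LRC-specific variant could subclass 
it later. Any codec-specific grouping or ordering (the SubBlockGroup idea) 
would stay inside the codec framework and is not exposed here.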

> Configurable and pluggable Erasure Codec and schema
> ---------------------------------------------------
>
>                 Key: HDFS-7337
>                 URL: https://issues.apache.org/jira/browse/HDFS-7337
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Kai Zheng
>         Attachments: HDFS-7337-prototype-v1.patch, 
> HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
> PluggableErasureCodec.pdf
>
>
> Following HDFS-7285 and its design, this JIRA proposes supporting multiple 
> erasure codecs via a pluggable approach. It allows defining and configuring 
> multiple codec schemas with different coding algorithms and parameters. The 
> resulting codec schemas can then be specified via a command-line tool for 
> different file folders. While designing and implementing this pluggable 
> framework, a concrete default codec (Reed-Solomon) will also be implemented 
> to prove the framework is useful and workable. A separate JIRA could be 
> opened for the RS codec implementation.
> Note that HDFS-7353 will focus on the low-level codec API and implementation, 
> making concrete vendor libraries transparent to the upper layer. This JIRA 
> focuses on the high-level parts that interact with configuration, schemas, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
