[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-7339:
----------------------------
    Description: 
All erasure codec operations center around the concept of the _block group_: 
block groups are formed in initial encoding and looked up in recoveries and 
conversions. A 
lightweight class {{BlockGroup}} is created to record the original and parity 
blocks in a coding group, as well as a pointer to the codec schema (pluggable 
codec schemas will be supported in HDFS-7337). With the striping layout, the 
HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. 
Therefore we propose to extend a file’s inode to switch between _contiguous_ 
and _striping_ modes, with the current mode recorded in a binary flag. An array 
of {{BlockGroups}} (or {{BlockGroup}} IDs) is added to the inode; it remains 
empty for “traditional” HDFS files with contiguous block layout.
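
As a rough illustration of the data structures described above, the following is a minimal sketch and not the code in the attached patch. All class, field, and method names ({{CodecSchemaSketch}}, {{BlockGroupSketch}}, {{StripedInodeSketch}}) are hypothetical, and the Reed-Solomon 3+2 layout in the demo is only an example value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a codec schema; pluggable schemas are left to HDFS-7337.
final class CodecSchemaSketch {
    final String codecName;
    final int numDataBlocks;
    final int numParityBlocks;

    CodecSchemaSketch(String codecName, int numDataBlocks, int numParityBlocks) {
        this.codecName = codecName;
        this.numDataBlocks = numDataBlocks;
        this.numParityBlocks = numParityBlocks;
    }
}

// Records the original and parity block IDs of one coding group plus its schema.
final class BlockGroupSketch {
    private final long[] dataBlockIds;
    private final long[] parityBlockIds;
    private final CodecSchemaSketch schema;

    BlockGroupSketch(long[] dataBlockIds, long[] parityBlockIds, CodecSchemaSketch schema) {
        if (dataBlockIds.length != schema.numDataBlocks
                || parityBlockIds.length != schema.numParityBlocks) {
            throw new IllegalArgumentException("block counts do not match the schema");
        }
        this.dataBlockIds = dataBlockIds.clone();
        this.parityBlockIds = parityBlockIds.clone();
        this.schema = schema;
    }

    int totalBlocks() { return dataBlockIds.length + parityBlockIds.length; }

    CodecSchemaSketch getSchema() { return schema; }
}

// Illustrative inode: a binary flag selects the layout, and the group list
// stays empty for files with the contiguous layout.
final class StripedInodeSketch {
    private final boolean striped;
    private final List<BlockGroupSketch> blockGroups = new ArrayList<>();

    StripedInodeSketch(boolean striped) { this.striped = striped; }

    boolean isStriped() { return striped; }

    void addBlockGroup(BlockGroupSketch group) {
        if (!striped) {
            throw new IllegalStateException("contiguous files hold no block groups");
        }
        blockGroups.add(group);
    }

    List<BlockGroupSketch> getBlockGroups() {
        return Collections.unmodifiableList(blockGroups);
    }
}

final class BlockGroupDemo {
    public static void main(String[] args) {
        CodecSchemaSketch rs = new CodecSchemaSketch("RS", 3, 2); // example values only
        StripedInodeSketch file = new StripedInodeSketch(true);
        file.addBlockGroup(new BlockGroupSketch(new long[] {1, 2, 3}, new long[] {4, 5}, rs));
        System.out.println("groups stored: " + file.getBlockGroups().size());
    }
}
```

The sketch fixes the layout flag at construction for brevity; switching an existing file between modes would need additional handling that is not shown here.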

The NameNode creates and maintains {{BlockGroup}} instances through the new 
{{ECManager}} component; the attached figure illustrates the architecture. As 
a simple example, when a _Striping+EC_ file is created and written to, 
{{ECManager}} serves requests from the client to allocate new {{BlockGroups}} 
and stores them under the {{INodeFile}}. In the current phase, 
{{BlockGroups}} are allocated both in initial online encoding and in the 
conversion from replication to EC. {{ECManager}} also facilitates the lookup of 
{{BlockGroup}} information for block recovery work.
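
The allocation and lookup roles of {{ECManager}} could look roughly like the sketch below. It is an assumption-laden illustration, not the patch: {{ECManagerSketch}}, {{GroupSketch}}, the method names, the sequential ID generation, and the use of a path-keyed map in place of the real {{INodeFile}} are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical group record: data block IDs first, then parity block IDs.
final class GroupSketch {
    final long groupId;
    final long[] blockIds;

    GroupSketch(long groupId, long[] blockIds) {
        this.groupId = groupId;
        this.blockIds = blockIds;
    }
}

// Hypothetical manager: allocates groups per file and indexes them by block ID
// so that recovery work can find the group a block belongs to.
final class ECManagerSketch {
    private final int blocksPerGroup;
    private long nextBlockId = 1;
    private long nextGroupId = 1;
    private final Map<Long, GroupSketch> groupByBlockId = new HashMap<>();
    // Stand-in for "stored under the INodeFile": groups keyed by file path.
    private final Map<String, List<GroupSketch>> groupsByPath = new HashMap<>();

    ECManagerSketch(int numDataBlocks, int numParityBlocks) {
        this.blocksPerGroup = numDataBlocks + numParityBlocks;
    }

    // Serves a client request for a new block group on the given file.
    GroupSketch allocateBlockGroup(String path) {
        long[] ids = new long[blocksPerGroup];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = nextBlockId++;
        }
        GroupSketch group = new GroupSketch(nextGroupId++, ids);
        for (long id : ids) {
            groupByBlockId.put(id, group);
        }
        groupsByPath.computeIfAbsent(path, p -> new ArrayList<>()).add(group);
        return group;
    }

    // Returns the group containing the block, or null if the block is unknown.
    GroupSketch lookupByBlock(long blockId) {
        return groupByBlockId.get(blockId);
    }

    List<GroupSketch> groupsOf(String path) {
        return groupsByPath.getOrDefault(path, Collections.emptyList());
    }
}

final class ECManagerDemo {
    public static void main(String[] args) {
        ECManagerSketch manager = new ECManagerSketch(3, 2); // example values only
        manager.allocateBlockGroup("/striped/file");
        GroupSketch second = manager.allocateBlockGroup("/striped/file");
        long lostBlock = second.blockIds[0];
        System.out.println("block " + lostBlock + " belongs to group "
                + manager.lookupByBlock(lostBlock).groupId);
    }
}
```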

  was:
All erasure codec operations center around the concept of _block groups_, which 
are formed in encoding and looked up in decoding. This JIRA creates a 
lightweight {{BlockGroup}} class to record the original and parity blocks in an 
encoding group, as well as a pointer to the codec schema. Pluggable codec 
schemas will be supported in HDFS-7337. 

The NameNode creates and maintains {{BlockGroup}} instances through 2 new 
components; the attached figure has an illustration of the architecture.

{{ECManager}}: This module manages {{BlockGroups}} and associated codec 
schemas. As a simple example, it stores the codec schema of Reed-Solomon 
algorithm with 3 original and 2 parity blocks (5 blocks in each group). Each 
{{BlockGroup}} points to the schema it uses. To facilitate lookups during 
recovery requests, {{BlockGroups}} should be organized as a map keyed by 
{{Blocks}}.

{{ErasureCodingBlocks}}: Block encoding work is triggered by multiple events. 
This module analyzes the incoming events, and dispatches tasks to 
{{UnderReplicatedBlocks}} to create parity blocks. A new queue 
({{QUEUE_INITIAL_ENCODING}}) will be added to the 5 existing priority queues to 
maintain the relative order of encoding and replication tasks.
* Whenever a block is finalized and meets EC criteria -- including 1) block 
size is full; 2) the file’s storage policy allows EC -- {{ErasureCodingBlocks}} 
tries to form a {{BlockGroup}}. In order to do so it needs to store a set of 
blocks waiting to be encoded. Different grouping algorithms can be applied -- 
e.g., always grouping blocks in the same file. Blocks in a group should also 
reside on different DataNodes, and ideally on different racks, to tolerate node 
and rack failures. If successful, it records the formed group with 
{{ECManager}} and inserts the parity blocks into {{QUEUE_INITIAL_ENCODING}}.
* When a parity block or a raw block in {{ENCODED}} state is found missing, 
{{ErasureCodingBlocks}} adds it to existing priority queues in 
{{UnderReplicatedBlocks}}. E.g., if all parity blocks in a group are lost, they 
should be added to {{QUEUE_HIGHEST_PRIORITY}}. New priorities might be added 
for finer-grained differentiation (e.g., loss of a raw block versus loss of a 
parity block).
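
The grouping constraint in the first bullet of this earlier design (blocks in a group should reside on different DataNodes) can be sketched with a simple greedy pass over the waiting set. This is only one possible grouping algorithm under stated assumptions: {{GroupFormerSketch}} and {{PendingBlockSketch}} are hypothetical names, rack awareness is ignored, and parity block creation and queueing are not modeled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical record of a finalized block waiting to be encoded.
final class PendingBlockSketch {
    final long blockId;
    final String dataNode;

    PendingBlockSketch(long blockId, String dataNode) {
        this.blockId = blockId;
        this.dataNode = dataNode;
    }
}

// Greedy grouping: keep a waiting set and form a group as soon as enough
// blocks on pairwise different DataNodes are available.
final class GroupFormerSketch {
    private final int groupSize;
    private final List<PendingBlockSketch> pending = new ArrayList<>();

    GroupFormerSketch(int groupSize) { this.groupSize = groupSize; }

    // Adds a block; returns a newly formed group, or null if none can be formed yet.
    List<PendingBlockSketch> offer(PendingBlockSketch block) {
        pending.add(block);
        List<PendingBlockSketch> picked = new ArrayList<>();
        Set<String> usedNodes = new HashSet<>();
        for (PendingBlockSketch candidate : pending) {
            // Set.add returns false when the DataNode is already represented.
            if (usedNodes.add(candidate.dataNode)) {
                picked.add(candidate);
                if (picked.size() == groupSize) {
                    break;
                }
            }
        }
        if (picked.size() < groupSize) {
            return null;
        }
        pending.removeAll(picked);
        return picked;
    }

    int pendingCount() { return pending.size(); }
}

final class GroupFormerDemo {
    public static void main(String[] args) {
        GroupFormerSketch former = new GroupFormerSketch(3); // 3 original blocks, example only
        former.offer(new PendingBlockSketch(1, "dn1"));
        former.offer(new PendingBlockSketch(2, "dn1"));
        former.offer(new PendingBlockSketch(3, "dn2"));
        List<PendingBlockSketch> group = former.offer(new PendingBlockSketch(4, "dn3"));
        System.out.println("group formed: " + (group != null)
                + ", still waiting: " + former.pendingCount());
    }
}
```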


> Create block groups for initial block encoding
> ----------------------------------------------
>
>                 Key: HDFS-7339
>                 URL: https://issues.apache.org/jira/browse/HDFS-7339
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: Encoding-design-NN.jpg, HDFS-7339-001.patch
>
>



