Regarding Hadoop Erasure Coding architecture

Chaitanya M V S Mon, 11 Jun 2018 09:52:08 -0700

Hi!

We are a group of people trying to understand the architecture of erasure
coding in Hadoop 3.0. We have been having difficulty understanding a few
terms and concepts related to it.


1. What do the terms Block, Block Group, Stripe, Cell and Chunk mean in the
context of erasure coding (these terms take on different meanings and are
used interchangeably across various documentation and blogs)? How are these
concepts used when reading and writing EC data?

2. How has the concept of the block from previous versions been carried
over to EC?

3. The higher-level APIs, ErasureCoders and ErasureCodec, still haven't
been plugged into Hadoop. Also, I haven't found any new Jira regarding
this. Are there any updates or pointers on the incorporation of these APIs
into Hadoop?

4. How is the datanode for reconstruction work chosen? Also, how are the
buffer sizes for the reconstruction work determined?


Thanks in advance for your time and consideration.

Regards,
M.V.S.Chaitanya
