[
https://issues.apache.org/jira/browse/HDFS-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jing Zhao updated HDFS-16875:
-----------------------------
Attachment: Erasure Coding Access Proxy.pdf
> Erasure Coding: data access proxy to allow old clients to read EC data
> ----------------------------------------------------------------------
>
> Key: HDFS-16875
> URL: https://issues.apache.org/jira/browse/HDFS-16875
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: ec, erasure-coding
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Priority: Major
> Attachments: Erasure Coding Access Proxy.pdf
>
>
> Erasure Coding (EC) is supported only in Hadoop 3, while many production
> deployments still depend on Hadoop 2. Upgrading the whole data tech stack to
> Hadoop 3 may involve a large migration effort and even reliability
> risks, given the incompatibilities between the two major releases and the
> possibility of undiscovered issues in the newer release. We therefore need
> a solution that, with the least migration effort and risk, lets us adopt
> Erasure Coding for cost efficiency while still allowing old HDFS clients
> (Hadoop 2.x) to access EC data transparently.
> Internally, we have developed an EC access proxy that translates EC data
> for old clients. We have also extended the NameNode RPC so that it can
> distinguish HDFS clients with EC support from those without it, and redirect
> the old clients to the proxy. With the proxy, we set up separate Erasure
> Coding clusters storing hundreds of PB of data, while leaving other
> production clusters and all the upper-layer applications untouched.
> Because some of the changes touch fundamental components of HDFS (e.g., the
> client-NN RPC header), we do not aim to merge them into trunk. We will
> use this ticket to share the design and implementation details (including the
> code) and to collect feedback. We may later open source the implementation in
> a separate GitHub repo.
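
The redirect decision described above (the NameNode telling EC-capable clients apart from old ones and sending the old ones to the proxy) can be sketched roughly as follows. This is an illustration only, not the code attached to the ticket: the class name, the method, the boolean capability flag, and the host names are all hypothetical, and the real change carries the capability information in the client-NN RPC header rather than in a method argument.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from HDFS or from the ticket.
class EcRoutingSketch {

    // Decide which hosts a read should be served from.
    // clientSupportsEc stands in for whatever the extended RPC header reports.
    static List<String> resolveReadTargets(boolean clientSupportsEc,
                                           boolean fileIsErasureCoded,
                                           List<String> dataNodes,
                                           List<String> proxies) {
        // Only old clients reading EC files need the proxy to translate the data.
        if (fileIsErasureCoded && !clientSupportsEc) {
            return proxies;
        }
        // EC-capable clients, and any client reading a non-EC file,
        // talk to the DataNodes directly.
        return dataNodes;
    }

    public static void main(String[] args) {
        List<String> dataNodes = Arrays.asList("dn-1", "dn-2", "dn-3"); // made-up hosts
        List<String> proxies = Arrays.asList("ec-proxy-1");             // made-up host

        System.out.println(resolveReadTargets(false, true, dataNodes, proxies));  // old client, EC file
        System.out.println(resolveReadTargets(true, true, dataNodes, proxies));   // new client, EC file
        System.out.println(resolveReadTargets(false, false, dataNodes, proxies)); // old client, plain file
    }
}
```

In this sketch only the combination of an old client and an erasure-coded file is routed to the proxy; every other case is left on the direct DataNode path, which matches the stated goal of leaving existing clusters and applications untouched.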