[ 
https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210790#comment-16210790
 ] 

SammiChen commented on HDFS-12682:
----------------------------------

Hi [~xiaochen],  thanks for reporting this issue. Inspired by your discovery, I 
found the same issue exists in system EC persist into and load from fsImage 
(HFDS-12686).  The current convertErasureCodingPolicy function is perfect in 
most cases. For special cases, like get all erasure coding policy and persist 
policy into fsImage, I think we need a new edition for full convert. 

{quote}
The problem I see from HDFS-12258's implementation though, is the mutable ECPS 
is saved on the immutable ECP, breaking assumptions such as shared single 
instance policy. At the same time the policy is still not persisted 
independently. I think ECPS is highly dependent on the missing piece from 
HDFS-7337: policies are not persisted to NN metadata. The state of whether a 
policy is enabled could be persisted together with the policy, without 
impacting HDFSFileStatus.
{quote}
Persist ec policies is implemented in HDFS-7337. 

{quote}
I think this bug (HDFS-12682) and HDFS-12258 would make more sense if we could 
first persist policies to NN metadata. Would also be helpful to separate out 
something like ErasureCodingPolicyAndState for the policy-specific APIs, so the 
state isn't deserialized onto HDFSFileStatus.
{quote}
For HDFS-12258, [~zhouwei],  [~drankye] and I, we discussed and do have two 
different approaches when we first think about how to implement it. One is the 
current implemented approach, which add one extra "state" field in the existing 
ECP definition. Another is define a new class, something like 
{{ErasureCodingPolicyWithState}} to hold the EPC and new policy state field.  
They are almost equally good.  The only concern is if we introduce the new 
{{ErasureCodingPolicyWithState}}, it may introduce complexity to API 
interfaces, and to end users. There are multiple EC related APIs.  If we return 
 {{ErasureCodingPolicyWithState}} for {{getAllErasureCodingPolicies}} , should 
we return {{ErasureCodingPolicyWithState}} or {{ErasureCodingPolicy}} for 
{{getErasureCodingPolicy}}? something like that. Also is it worth to introduce 
a new class definition in Hadoop which only has 1 extra new field?   After all 
the considerations, the current approach is chosen to leverage the existing 
ECP. 

Please let me know if you have other concerns.  Thanks!

> ECAdmin -listPolicies will always show policy state as DISABLED
> ---------------------------------------------------------------
>
>                 Key: HDFS-12682
>                 URL: https://issues.apache.org/jira/browse/HDFS-12682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>              Labels: hdfs-ec-3.0-must-do
>
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as 
> DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing 
> protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942],
>  the static instance of [SystemErasureCodingPolicies 
> class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101]
>  is first checked, and always returns the cached policy objects, which are 
> created by default with state=DISABLED.
> All the existing unit tests pass, because that static instance that the 
> client (e.g. ECAdmin) reads in unit test is updated by NN. :)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to