[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864318#comment-15864318 ]
Andrew Wang edited comment on HDFS-7859 at 2/14/17 10:47 PM:
-------------------------------------------------------------
I thought about this JIRA some more, and had two questions I wanted to bring up
for discussion:
h3. Do we need a system default EC policy?
AFAICT, the system default policy dates from when we only supported a single
policy for HDFS. Now, we've pretty clearly defined the API for EC policies, and
for most uses, the EC policy is automatically inherited from a dir-level
policy. The {{setErasureCodingPolicy}} API already requires an EC policy to be
specified, so I think the default EC policy is basically vestigial and can be
removed.
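To make the inheritance argument concrete, here is a minimal sketch of how a file's policy could be resolved from the nearest ancestor directory with no system default to fall back on. It is a toy model, not HDFS code: the {{EcPolicyResolver}} class, its methods, and the policy name {{RS-6-3}} are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of directory-level EC policy inheritance. Hypothetical; not HDFS code. */
class EcPolicyResolver {
    // Directory path -> policy name that was explicitly set on it.
    private final Map<String, String> dirPolicies = new HashMap<>();

    /** A policy must always be named explicitly; there is no default to fall back on. */
    void setPolicy(String dir, String policyName) {
        if (policyName == null || policyName.isEmpty()) {
            throw new IllegalArgumentException("an EC policy must be specified");
        }
        dirPolicies.put(dir, policyName);
    }

    /** Returns the policy of the nearest ancestor (or the path itself) that has one. */
    Optional<String> resolve(String path) {
        String current = path;
        while (current != null) {
            String policy = dirPolicies.get(current);
            if (policy != null) {
                return Optional.of(policy);
            }
            current = parent(current);
        }
        // No policy anywhere on the path: the file is simply not erasure coded.
        return Optional.empty();
    }

    /** Parent of an absolute path, or null once the root has been passed. */
    private static String parent(String path) {
        if (path.equals("/")) {
            return null;
        }
        int idx = path.lastIndexOf('/');
        if (idx < 0) {
            return null; // not an absolute path; stop walking
        }
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    public static void main(String[] args) {
        EcPolicyResolver resolver = new EcPolicyResolver();
        resolver.setPolicy("/ec", "RS-6-3"); // placeholder policy name
        System.out.println(resolver.resolve("/ec/data/file1"));
        System.out.println(resolver.resolve("/plain/file2"));
    }
}
```

In this model nothing ever consults a system-wide default: a path either inherits an explicitly set policy or has none.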
h3. Can we use configuration instead of persistence for the set of enabled policies?
I'm wondering if there is actually any benefit to persisting the set of allowed
policies. In the past, we've enabled and disabled features via configuration
keys, and this is basically the same idea. There's no danger of data corruption
from two NNs having different sets of enabled policies, so it's safe in that
sense. IMO we could add a key like {{dfs.namenode.erasure.coding.policies.enabled}}
and have it list the enabled policies, chosen from the set of hardcoded policies.
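As a rough sketch of the configuration approach, the NameNode could parse a comma-separated value at startup and reject any name that is not one of the hardcoded policies. This is only an illustration under stated assumptions: {{java.util.Properties}} stands in for the Hadoop configuration object, the comma-separated format is assumed, and the policy names and the {{EnabledEcPolicies}} class are placeholders.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

/** Sketch of loading the enabled EC policies from configuration. Hypothetical; not HDFS code. */
class EnabledEcPolicies {
    // Key name as proposed in the comment; the value format is an assumption.
    static final String ENABLED_KEY = "dfs.namenode.erasure.coding.policies.enabled";

    // Placeholder names standing in for the hardcoded system policies.
    static final Set<String> HARDCODED =
        new LinkedHashSet<>(Arrays.asList("RS-6-3", "RS-3-2", "XOR-2-1"));

    /** Parses the comma-separated value and validates each name against the hardcoded set. */
    static Set<String> load(Properties conf) {
        Set<String> enabled = new LinkedHashSet<>();
        String raw = conf.getProperty(ENABLED_KEY, "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate an unset key and stray commas
            }
            if (!HARDCODED.contains(trimmed)) {
                throw new IllegalArgumentException(
                    "Unknown EC policy in " + ENABLED_KEY + ": " + trimmed);
            }
            enabled.add(trimmed);
        }
        return enabled;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ENABLED_KEY, "RS-6-3, XOR-2-1");
        System.out.println(load(conf));
    }
}
```

Because the enabled set is recomputed from configuration on each start, nothing needs to be written to the fsimage or edit log in this sketch.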
If the above sounds good, I can file a new JIRA for refactoring out the system
default policies, and do the configuration key over on HDFS-11314.
> Erasure Coding: Persist erasure coding policies in NameNode
> -----------------------------------------------------------
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Andrew Wang
> Priority: Blocker
> Labels: BB2015-05-TBR, hdfs-ec-3.0-must-do
> Attachments: HDFS-7859.001.patch, HDFS-7859.002.patch,
> HDFS-7859.004.patch, HDFS-7859.005.patch, HDFS-7859.006.patch,
> HDFS-7859.007.patch, HDFS-7859.008.patch, HDFS-7859.009.patch,
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch,
> HDFS-7859-HDFS-7285.003.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we
> persist EC schemas in NameNode centrally and reliably, so that EC zones can
> reference them by name efficiently.