[
https://issues.apache.org/jira/browse/HDFS-12405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16759395#comment-16759395
]
Ayush Saxena commented on HDFS-12405:
-------------------------------------
Thanks, everyone, for the discussion.
bq. Why do we have to clean up the removed policies? I think NameNode restarts
are not frequent enough, so cleaning up at that time can only cover a small
portion of policies. Would cleaning them up when the NameNode restarts
suffice?
Correct. Doing the cleanup at NameNode restart does not cover all removed
policies and, as said, a NameNode restart is not a frequent operation.
Regarding the cleanup: as I understand the current behavior, a removed policy
is just like a disabled policy that can't be re-enabled. It doesn't make sense
to keep it around like that. There is one more problem with keeping it.
Presently there is a limit of 64 on the number of user-defined policies; once
that threshold is reached we can't add more. Even if we call remove policy, the
policy is not cleaned up, so we won't be able to add any more. IMO we can add
an extra parameter, -clean or say -force, to remove policy, which would also
clean up the policy while removing it. I guess that should solve the
bottleneck described above too.
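
To make the slot problem concrete, here is a minimal sketch of it as a toy
model. It is not the NameNode's actual data structure: the PolicyTable class,
the state strings, and the clean flag (standing in for the proposed -clean or
-force parameter) are all hypothetical. Only the limit of 64 comes from the
discussion above.

```python
from dataclasses import dataclass, field

MAX_USER_POLICIES = 64  # limit on user-defined policies quoted above


@dataclass
class PolicyTable:
    """Toy model only; names and states here are illustrative assumptions."""
    slots: dict = field(default_factory=dict)  # policy name -> state

    def add(self, name):
        # A policy in the "REMOVED" state still occupies a slot,
        # so it counts against the limit.
        if len(self.slots) >= MAX_USER_POLICIES:
            raise RuntimeError("user-defined policy limit reached")
        self.slots[name] = "DISABLED"

    def remove(self, name, clean=False):
        if clean:
            # Hypothetical cleanup path: drop the entry and free the slot.
            del self.slots[name]
        else:
            # Modeled current behavior: mark as removed, keep the slot.
            self.slots[name] = "REMOVED"
```

In this model, once 64 policies have been added, a plain remove() followed by
add() still fails, while remove(name, clean=True) frees a slot so that the
next add() succeeds.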
Regarding the check for whether some file is using that EC policy: that is
certainly a heavy operation, but an admin who wants to be sure beforehand can
run the ls command with the -e option and grep for the policy.
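
As a rough sketch of that manual check, the helper below filters the text of
a listing for lines that mention a given policy. It assumes the listing was
captured separately, for example from something like
"hdfs dfs -ls -R -e /some/path" (the -R flag and the path are my additions for
illustration). The function name is hypothetical, and it makes no assumption
about which column holds the policy.

```python
def lines_using_policy(ls_output, policy_name):
    """Return listing lines that contain policy_name as a whole field.

    Matching on whitespace-separated fields is slightly stricter than a
    plain grep, which would also match substrings of longer names.
    """
    return [
        line
        for line in ls_output.splitlines()
        if policy_name in line.split()
    ]
```

An empty result only says that nothing under the listed path uses the policy;
the whole namespace would have to be listed to be sure, which is exactly the
heavy part.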
> Clean up removed erasure coding policies from namenode
> ------------------------------------------------------
>
> Key: HDFS-12405
> URL: https://issues.apache.org/jira/browse/HDFS-12405
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Reporter: Sammi Chen
> Assignee: Huafeng Wang
> Priority: Major
> Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, when an erasure coding policy is removed, it transitions to the
> "removed" state. Users can no longer apply a policy in the "removed" state to
> a file or directory. The policy cannot be safely removed from the system
> unless we know that no existing files or directories use this "removed"
> policy. Finding out whether any files or directories are using the policy is
> time-consuming at runtime and might impact NameNode performance. So a better
> choice is to do the work when the NameNode restarts and loads inodes.
> Collecting the information at that time will not introduce much extra
> overhead.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)