[
https://issues.apache.org/jira/browse/HDFS-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145738#comment-14145738
]
Andrew Wang commented on HDFS-7081:
-----------------------------------
bq. If we can set storage policy directly on a directory, why do we still need
to do it recursively? But to provide a tool for easier administration (not just
for setting storage policy) is always good.

This is related to my question about renames. I could see an admin wanting to
know that everything in a subtree uses some storage policy. However, if a file
already has a policy set and is renamed underneath this subtree, the subtree's
policy won't apply. A recursive tool could be used to satisfy this use case.

As one data point, I know Hive uses a temp dir during query processing and
renames things in and out.

I'm still hoping we can avoid this rename ambiguity though, since it'd make
management simpler. If we need per-file granularity, then I think my idea from
above would work. Basically, do not set UNSPECIFIED on files. At create time, a
file sets its storage policy either to an inherited parent policy or to the
default policy. Then rename will never change a file's policy.

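To make the proposal concrete, here is a rough sketch of the semantics as a toy in-memory model. This is not HDFS code: PolicyModel, Dir, FileNode, create, and rename are hypothetical names, and the HOT default is just an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model, not HDFS code: every name here is hypothetical. */
final class PolicyModel {
    static final String DEFAULT_POLICY = "HOT"; // assumed cluster default

    /** A directory may carry an explicit policy, or null to inherit. */
    static class Dir {
        final Dir parent;
        final String policy; // null means "inherit from parent"
        final Map<String, FileNode> files = new HashMap<>();

        Dir(Dir parent, String policy) {
            this.parent = parent;
            this.policy = policy;
        }

        /** Walks up the tree to the nearest explicit policy, else the default. */
        String effectivePolicy() {
            for (Dir d = this; d != null; d = d.parent) {
                if (d.policy != null) {
                    return d.policy;
                }
            }
            return DEFAULT_POLICY;
        }
    }

    /** A file always stores a concrete policy; there is no UNSPECIFIED state. */
    static class FileNode {
        final String policy;

        FileNode(String policy) {
            this.policy = policy;
        }
    }

    /** Create: resolve the policy once, at creation time, and pin it on the file. */
    static FileNode create(Dir dir, String name) {
        FileNode f = new FileNode(dir.effectivePolicy());
        dir.files.put(name, f);
        return f;
    }

    /** Rename: only the directory entry moves; the pinned policy is untouched. */
    static void rename(Dir src, String name, Dir dst) {
        FileNode f = src.files.remove(name);
        if (f == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        dst.files.put(name, f);
    }
}
```

In this model, a file created in a temp dir with no explicit policy is pinned to HOT, and it stays HOT after being renamed into a directory whose policy is COLD. Files created directly in that directory get COLD. So the destination subtree's policy still does not reach the renamed file, but the outcome is always the same and easy to reason about.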
bq. For this one I have a question. According to the current document "TRUSTED
namespace attributes are only visible and accessible to privileged users."
Currently the storage policy is actually set by superuser and in HDFS we do not
have root user. So does that mean we should use trusted here?

TRUSTED and USER are meant to be used by end-user applications. The idea is
that apps can stash whatever app data they want in those xattr namespaces and
not worry about name collisions (except from other apps). For HDFS developers
who want to leverage xattr storage for a feature, an internal namespace like
SYSTEM is more appropriate so as not to pollute the user namespaces. As we're
doing in this JIRA, the additional data can be exposed to users via some new
API, rather than through getXAttrs.

As to the rest, I'll just trust you and Nic. I'm not sure I'll have time to
review more this week, so we can just do follow-ons. Thanks guys.

> Add new DistributedFileSystem API for getting all the existing storage
> policies
> -------------------------------------------------------------------------------
>
> Key: HDFS-7081
> URL: https://issues.apache.org/jira/browse/HDFS-7081
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: balancer, namenode
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-7081.000.patch, HDFS-7081.001.patch,
> HDFS-7081.002.patch, HDFS-7081.003.patch
>
>
> Instead of loading all the policies from a client side configuration file, it
> may be better to provide Mover with a new RPC call for getting all the
> storage policies from the namenode.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)