[ 
https://issues.apache.org/jira/browse/HDFS-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141690#comment-14141690
 ] 

Andrew Wang commented on HDFS-7081:
-----------------------------------

Hey guys, this isn't a full review, but I wanted to just get this stuff out 
there so we have the weekend to mull it over. This is my first time looking 
through this code in detail, so some of my comments extend beyond the immediate 
scope of this patch. I've tried to split it up accordingly, so we can do some 
of this in a follow-on.

High-level:
* Any reason we don't also expose java APIs for setting/getting the policy in 
HdfsAdmin? Need to have public classes then too.
* Does DFSAdmin have a way of getting the list of storage policies? Not sure 
how you'd know what policy to set if you don't know what's available.
* Also a bit surprised that the API is string based rather than ID based. 
Making it ID based would be more efficient and also allow renaming policies if 
desired.

FSNamesystem:
* Need javadoc on new getter

Not related to this patch:

Dir-level storage policies:
* The handling of directory storage policies seems inconsistent with rename. If 
you rename an UNSPECIFIED file, it will pick up a potentially different storage 
ID from its destination. However, if it already has one set (e.g. it was 
created under a dir with a set policy), it'll keep that policy. Have you 
considered always setting the file's storage policy to the default? Then rename 
would be consistent. The other alternative is doing away with per-file storage 
policies and *only* specifying them on a directory. This also seems reasonable 
to me from a management perspective.
* This seems worth documenting at the least.
* We should also consider having tooling that can do a recursive 
setStoragePolicy for easier administration.

BlockStoragePolicySuite:
* In the class javadoc, "Suite" by itself doesn't mean much to me. Could we 
mention "collection of storage policies"?
* The xattr name "bsp" is not very descriptive. Maybe 
"hsm.block.storage.policy.id" instead? We dedupe the names in memory, so the 
size of the name only really matters when it's serialized to the fsimage. I 
think we could dedupe them there too with a little work.
* We also shouldn't be using the trusted namespace, since it pollutes a 
namespace that's meant for use only by the root user. I'd recommend system 
instead.
* BlockStoragePolicy#readBlockStorageSuite(conf) doesn't seem to be used, could 
remove some other helper functions then too
* Would prefer if we didn't initialize DEFAULT_SUITE until it's needed, avoid 
some static init cost. Can be straight up removed if you remove 
readBlockStorageSuite(conf).
* Config keys belong in DFSConfigKeys.java
* Has some lines longer than 80 chars

BlockStoragePolicy:
* Should be using slf4j for new logging
* I'd strongly consider a builder pattern for this constructor, since it seems 
likely/possible that we'll want to add more fields here.
* {{diff}}: I'm not sure what a "list difference" is, but it's not a set 
difference since it doesn't handle duplicates. Update javadoc about the 
expectations?

> Add new DistributedFileSystem API for getting all the existing storage 
> policies
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-7081
>                 URL: https://issues.apache.org/jira/browse/HDFS-7081
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer, namenode
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-7081.000.patch, HDFS-7081.001.patch, 
> HDFS-7081.002.patch
>
>
> Instead of loading all the policies from a client side configuration file, it 
> may be better to provide Mover with a new RPC call for getting all the 
> storage policies from the namenode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to