[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110316#comment-16110316 ]
Xiao Chen commented on HDFS-10285:
----------------------------------
Hi [~umamaheswararao] and all,
This looks to be a cool feature, thanks for working on this!
I partially reviewed the NN parts, and have some implementation-level questions
and comments:
- In {{FSDirSatisfyStoragePolicyOp}}, we keep a temporary list of INodes,
{{candidateNodes}}. If satisfyStoragePolicy is called on a big dir (e.g.
'{{/}}'), we could end up holding all INodes in that list, right? This may be a
problem if the NN doesn't have the extra heap room. I see there's a TODO about
labeling the dir; I suggest optimizing memory usage here too.
- The new {{removeXattr}} method in FSN does not appear to be audit logged. Is
that intentional?
- In {{BlockManager}}, {{haEnabled}} is added, but other places in the same
class currently check that from the config via
{{HAUtil.isHAEnabled(conf, nsId)}}.
- {{BlockManager}}, if {{(storagePolicyEnabled && spsEnabled) == false}}, we
print a WARN log. I think if neither of them is enabled - which would be the
default if we agree on [~andrew.wang]'s suggestion - it should not print a WARN.
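
To illustrate the memory concern in the first point, here is a minimal sketch of walking a directory tree and handing files to a consumer in fixed-size batches, instead of first collecting every INode into one list. It is only an illustration under stated assumptions: {{SketchNode}}, {{BatchedTraversalSketch}} and {{forEachFileBatch}} are hypothetical names, not classes from the patch, and the sketch ignores NN locking and concurrent namespace changes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for an inode; this is NOT the HDFS INode class.
final class SketchNode {
  final long id;
  final boolean isDir;
  final List<SketchNode> children = new ArrayList<>();

  SketchNode(long id, boolean isDir) {
    this.id = id;
    this.isDir = isDir;
  }

  SketchNode add(SketchNode child) {
    children.add(child);
    return this;
  }
}

final class BatchedTraversalSketch {
  /**
   * Depth-first walk that passes file ids to the sink in batches of at most
   * batchSize. Apart from the current batch, only one iterator per directory
   * level is held, so extra memory grows with tree depth and batch size
   * rather than with the total number of files.
   *
   * @return the number of files visited
   */
  static long forEachFileBatch(SketchNode root, int batchSize,
      Consumer<List<Long>> sink) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    long total = 0;
    List<Long> batch = new ArrayList<>(batchSize);
    Deque<Iterator<SketchNode>> stack = new ArrayDeque<>();
    stack.push(Collections.singletonList(root).iterator());
    while (!stack.isEmpty()) {
      Iterator<SketchNode> it = stack.peek();
      if (!it.hasNext()) {
        stack.pop(); // this directory level is exhausted
        continue;
      }
      SketchNode node = it.next();
      if (node.isDir) {
        stack.push(node.children.iterator());
      } else {
        batch.add(node.id);
        total++;
        if (batch.size() == batchSize) {
          sink.accept(batch); // hand off a full batch, then start a new one
          batch = new ArrayList<>(batchSize);
        }
      }
    }
    if (!batch.isEmpty()) {
      sink.accept(batch); // flush the final partial batch
    }
    return total;
  }

  public static void main(String[] args) {
    SketchNode root = new SketchNode(0, true)
        .add(new SketchNode(1, false))
        .add(new SketchNode(10, true)
            .add(new SketchNode(2, false))
            .add(new SketchNode(3, false)))
        .add(new SketchNode(4, false));
    long total = forEachFileBatch(root, 2,
        batch -> System.out.println("batch " + batch));
    System.out.println("files seen: " + total);
  }
}
```

Whether something like this fits depends on how the TODO about labeling the dir is resolved; the point is only that the candidate set does not have to be materialized in full before processing starts.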
> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch,
> HDFS-10285-consolidated-merge-patch-01.patch,
> HDFS-SPS-TestReport-20170708.pdf,
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf,
> Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policy. These
> policies can be set on a directory/file to specify the user's preference for
> where to store the physical blocks. When the user sets the storage policy
> before writing data, the blocks can take advantage of the storage policy
> preferences and the physical blocks are stored accordingly.
> If the user sets the storage policy after writing and completing the file,
> the blocks will already have been written with the default storage policy
> (that is, DISK). The user then has to run the ‘Mover tool’ explicitly,
> specifying all such file names as a list. In some distributed system
> scenarios (e.g. HBase) it would be difficult to collect all the files and run
> the tool, as different nodes can write files separately and files can have
> different paths.
> Another scenario is when the user renames a file from a directory with one
> effective storage policy (inherited from the parent directory) to a
> directory with a different effective storage policy: the inherited storage
> policy is not copied from the source, so the policy of the destination
> file/dir parent takes effect. This rename operation is just a metadata
> change in the Namenode. The physical blocks still remain placed according to
> the source storage policy.
> So, tracking all such business-logic-based file names from distributed nodes
> (e.g. region servers) and running the Mover tool could be difficult for
> admins.
> Here the proposal is to provide an API in the Namenode itself to trigger
> storage policy satisfaction. A daemon thread inside the Namenode should track
> such calls and process them into movement commands for the DNs.
> Will post the detailed design thoughts document soon.