[
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279113#comment-16279113
]
Daryn Sharp commented on HDFS-10285:
------------------------------------
I lost my initial cursory notes on bugs in this patch. I immediately wondered
whether this feature really belongs in the NN, but I was going to overlook that
for the initial review. I'm glad others raised the question.
My preference is that this feature, like all scan features, should live outside
the NN. Integrated functionality is arguably more user-friendly, but it comes
with its own costs, namely increased complexity and maintenance, and it is yet
another feature that future core work must accommodate.
There are many basic operational issues with integrated scan features: truly
being able to reconfigure on the fly; being able to run on a precisely
scheduled basis; being able to immediately and definitively kill the feature if
it is causing problems or the cluster is under unusual distress; and being able
to iteratively test new versions without bouncing the standby with the new
version, failing over, and failing back if it is not working as intended. An
adjunct service does not have these issues.
That said: _the cited issues with the balancer are actually a plus for me_. I
don't love the balancer itself, but I do love that it is a separate service.
I need to see exactly what sort of RPC calls would be necessary for this to be
feasible as a separate service. So long as it is a cheap read load, the NN can
handle at least 80k ops/sec (audit logged), and upwards of 300k ops/sec (if not
audit logged).
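As a rough illustration of the read load such an adjunct service would put on
the NN, the sketch below walks a directory and compares each file's storage
policy against the storage types of its replicas using only existing client
read calls (listStatus, getStoragePolicy, getFileBlockLocations). The class
name, path handling, and the simplified policy check are illustrative, not part
of any existing tool.
{code:java}
// Hypothetical sketch only: roughly the read RPC pattern an external
// SPS-style scanner would need. Names here are illustrative.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.BlockStoragePolicySpi;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.fs.StorageType;

public class ExternalSpsScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // picks up core-site.xml/hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);
    Path dir = new Path(args.length > 0 ? args[0] : "/");

    // One listStatus per directory page, plus two cheap read RPCs per file.
    RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
    while (it.hasNext()) {
      FileStatus stat = it.next();
      if (!stat.isFile()) {
        continue;                               // a real scanner would recurse
      }
      BlockStoragePolicySpi policy = fs.getStoragePolicy(stat.getPath());
      Set<StorageType> preferred =
          new HashSet<>(Arrays.asList(policy.getStorageTypes()));

      BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
      for (BlockLocation loc : locs) {
        for (StorageType actual : loc.getStorageTypes()) {
          // Simplified check: real policy matching honors per-replica
          // ordering and fallback storage types.
          if (!preferred.contains(actual)) {
            System.out.println("Policy not satisfied: " + stat.getPath());
          }
        }
      }
    }
  }
}
{code}
For a rough sense of scale under the numbers above: if the scan issues about
two read RPCs per file, a hypothetical 100-million-file namespace costs ~200M
ops, i.e. roughly 2,500 seconds (~40 minutes) at 80k ops/sec with audit
logging, or under 12 minutes at 300k ops/sec without it.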
> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch,
> HDFS-10285-consolidated-merge-patch-01.patch,
> HDFS-10285-consolidated-merge-patch-02.patch,
> HDFS-10285-consolidated-merge-patch-03.patch,
> HDFS-SPS-TestReport-20170708.pdf,
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf,
> Storage-Policy-Satisfier-in-HDFS-May10.pdf,
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of a storage policy. These
> policies can be set on a directory or file to express the user's preference for
> where physical blocks should be stored. When the user sets the storage policy
> before writing data, the blocks can take advantage of the policy and are placed
> on the preferred storage accordingly.
> If the user sets the storage policy after the file has been written and closed,
> the blocks will already have been written under the default storage policy
> (DISK). The user then has to run the Mover tool explicitly, supplying all such
> file names as a list. In some distributed-system scenarios (e.g. HBase) it is
> difficult to collect all those files and run the tool, since different nodes
> write files independently and the files can live under different paths.
> Another scenario: when the user renames a file from a directory with one
> effective storage policy (inherited from its parent directory) into a directory
> with a different storage policy, the inherited policy is not copied from the
> source; the file's effective policy comes from the destination's parent. The
> rename is just a metadata change in the Namenode, so the physical blocks still
> remain on storage chosen under the source policy.
> So tracking all such files, driven by business logic spread across distributed
> nodes (e.g. region servers), and then running the Mover tool is difficult for
> admins. The proposal here is to provide an API in the Namenode itself to
> trigger storage policy satisfaction: a daemon thread inside the Namenode would
> track such calls and turn them into block-movement commands for the DNs (a
> sketch of this flow follows below).
> Will post the detailed design document soon.
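To make the proposed flow above concrete, here is a minimal client-side sketch.
DistributedFileSystem#setStoragePolicy exists today; the satisfyStoragePolicy()
call is only the API proposed in this issue, so its final name, location, and
signature are assumptions, and the path used is hypothetical.
{code:java}
// Illustrative sketch of the proposed usage; satisfyStoragePolicy() is the
// API proposed by HDFS-10285 and may end up with a different name/signature.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SatisfyPolicySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // Hypothetical directory whose data was written (or renamed in) before the
    // policy took effect, so its blocks still sit on DISK.
    Path dir = new Path("/archive/2016");

    dfs.setStoragePolicy(dir, "COLD");   // existing API: metadata-only change

    // Proposed call: ask the NN to queue block movements so physical replicas
    // match the effective policy, instead of running the Mover with an
    // explicit file list.
    dfs.satisfyStoragePolicy(dir);
  }
}
{code}
Compared with running the Mover externally, the idea is that callers such as
HBase region servers could issue this one call per path and let the NN-side
daemon drive the DN block movements.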