[
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275094#comment-16275094
]
Anu Engineer commented on HDFS-10285:
-------------------------------------
bq. Could you describe this plan in more detail? ZK doesn't solve the problems
of HA by itself. We still need to think about idempotency. Does it require
ZKFCs?
Something similar to a ZKFC-like architecture would be the easiest way to do
this. We are in crawl, walk, run mode, so getting the SPS running as a
separate service is what we are focused on now.
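For illustration, a minimal sketch of that HA setup using Apache Curator's
LeaderLatch recipe (the znode path, class name, and runSatisfierLoop entry
point are hypothetical, not anything from the SPS branch):
{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SpsHaRunner {
  public static void main(String[] args) throws Exception {
    CuratorFramework zk = CuratorFrameworkFactory.newClient(
        "zk1:2181,zk2:2181,zk3:2181",          // ZK quorum (placeholder)
        new ExponentialBackoffRetry(1000, 3));  // retry on connection loss
    zk.start();

    // All standby SPS instances block here; exactly one holds the latch
    // and runs the satisfier loop, mirroring the ZKFC active/standby model.
    LeaderLatch latch = new LeaderLatch(zk, "/hadoop/sps/leader");
    latch.start();
    latch.await();
    runSatisfierLoop();  // hypothetical entry point for the SPS work
  }

  private static void runSatisfierLoop() { /* scan and schedule moves */ }
}
{code}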
bq. All of this adds significant complexity to deploying this feature
From a deployment point of view, all you will need to do is enable the SPS
(which we have to do even now), and then the SPS service can be started
automatically.
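As a sketch of what "enable the SPS" means in practice (the key and value
here are assumptions; this is roughly what later landed in hdfs-default.xml,
and the branch under discussion may spell it differently):
{code:xml}
<!-- hdfs-site.xml: enabling SPS. Key name and value are assumptions here. -->
<property>
  <name>dfs.storage.policy.satisfier.mode</name>
  <value>external</value>
</property>
{code}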
bq. Does this involve rescanning a significant portion of the namespace?
Synchronizing state over an RPC boundary (which can fail) is also more
complicated than going in-memory.
I don't agree. The fact that we have a large number of applications working
against HDFS by reading information from the Namenode should be enough
evidence that SPS can simply be another application that works against the
Namenode. There is no need to move that application into the Namenode.
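To make that concrete, a minimal sketch of an external application that works
against the Namenode purely through the public FileSystem API (all calls
shown exist today); it only detects replicas that may not match the file's
policy, which is the first step an external SPS would take:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.fs.StorageType;

public class PolicyScanner {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/"), true);
    while (it.hasNext()) {
      LocatedFileStatus f = it.next();
      String policy = fs.getStoragePolicy(f.getPath()).getName();
      for (BlockLocation loc : f.getBlockLocations()) {
        for (StorageType t : loc.getStorageTypes()) {
          // A real satisfier would compare t against the storage types the
          // policy prescribes and queue a block move on mismatch; here we
          // just report what we see.
          System.out.printf("%s policy=%s replicaOn=%s%n",
              f.getPath(), policy, t);
        }
      }
    }
  }
}
{code}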
bq. Is an edit log update on every block move? That would be a lot of overhead
The current block move logic in HDFS (this is not something done by SPS) is
such that when a block is moved, the receiving DataNode issues a block report
with a delete hint that tells the Namenode which replica to remove. So there
is no extra edit-log overhead from SPS.
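A simplified sketch of that existing flow, with hypothetical types and method
names standing in for the real DataNode internals (the Balancer/Mover
replace-block path plus incremental block reports):
{code:java}
// Illustrative pseudo-flow; names here are hypothetical stand-ins.
interface DataNodeStub {
  void receiveReplicaFrom(String sourceUuid, long blockId);  // copy replica
  void notifyNamenodeReceivedBlock(long blockId, String delHint);
}

class BlockMoveSketch {
  static void moveReplica(long blockId, String sourceUuid, DataNodeStub target) {
    target.receiveReplicaFrom(sourceUuid, blockId);
    // The target's incremental block report carries a delete hint naming the
    // source node, so the Namenode invalidates the surplus source replica.
    // No per-move edit-log transaction is written for this.
    target.notifyNamenodeReceivedBlock(blockId, /* delHint = */ sourceUuid);
  }
}
{code}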
bq. I don't agree with #1 for the reason stated above. The DiskBalancer is fine
since it's local to one DN, but the Balancer and Mover circumventing global
coordination is an anti-pattern IMO.
I disagree. It is just your opinion that the existing code is bad. Do you
have any metrics to prove that the existing code is bad? If so, would you be
kind enough to share them?
It also looks like consistency of opinion is not a virtue that you share :).
Please look at HDFS-6382 (https://issues.apache.org/jira/browse/HDFS-6382)
to see comments from lots of people, including *you*, on why a simple process
external to the Namenode seems like a good idea.
bq. Regarding #2, in my previous comment, I provided a number of tasks that
are performed by the SPS-in-NN.
Coming back from personal opinions to the technology decision: the list of
work items that must be maintained in the NN can become large. Yes, we have
introduced throttling, but that only cripples this feature.
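To illustrate why: the in-NN satisfier has to keep a bounded, throttled queue
of pending work, roughly along these lines (a hypothetical sketch, not the
patch's actual classes), and the bound and the throttle are exactly what cap
the feature's throughput:
{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Semaphore;

class SatisfierWorkQueue {
  // Bounded to protect NN heap: new work is rejected once the bound is hit.
  private final BlockingQueue<Long> pendingInodes = new ArrayBlockingQueue<>(10_000);
  // Throttle: caps block moves in flight so SPS cannot swamp the cluster.
  private final Semaphore movesInFlight = new Semaphore(100);

  boolean offer(long inodeId) {
    return pendingInodes.offer(inodeId);  // false once the queue is full
  }

  void dispatchOne() throws InterruptedException {
    long inodeId = pendingInodes.take();
    movesInFlight.acquire();  // blocks when too many moves are outstanding
    scheduleBlockMoves(inodeId);
  }

  private void scheduleBlockMoves(long inodeId) {
    /* issue move commands; release() the semaphore on completion */
  }
}
{code}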
The policies for how and when these blocks are moved, and the possibility of
making this a first-class HSM, are too attractive to forgo. With that in
mind, staying inside the Namenode is going to hamper the freedom this feature
has and the things it can do. As I mentioned already, several other features
will benefit immensely from this work.
bq. I don't follow how SSM or provided block storage benefit from SPS as a
service vs. being part of the NN. If there are design docs for these
interactions, I would appreciate some references.
I don't know if there are design documents on these topics yet; I have
gleaned most of this from conversations with other contributors.
> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch,
> HDFS-10285-consolidated-merge-patch-01.patch,
> HDFS-10285-consolidated-merge-patch-02.patch,
> HDFS-10285-consolidated-merge-patch-03.patch,
> HDFS-SPS-TestReport-20170708.pdf,
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf,
> Storage-Policy-Satisfier-in-HDFS-May10.pdf,
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These
> policies can be set on a directory or file to specify the user's preference for
> where the physical blocks should be stored. When the user sets the storage
> policy before writing data, the blocks can take advantage of the storage policy
> preferences and be stored accordingly.
> If the user sets the storage policy after writing and completing the file, the
> blocks will already have been written with the default storage policy (nothing
> but DISK). The user then has to run the Mover tool explicitly, specifying all
> such file names as a list. In some distributed-system scenarios (e.g. HBase) it
> would be difficult to collect all the files and run the tool, since different
> nodes can write files independently and files can have different paths.
> Another scenario: when the user renames a file from a directory with one
> effective storage policy (inherited from the parent directory) to a directory
> with a different storage policy, the inherited policy is not copied from the
> source, so the file takes its effective policy from the destination's parent.
> This rename operation is just a metadata change in the Namenode; the physical
> blocks still remain placed according to the source storage policy.
> So, tracking all such file names from distributed nodes (e.g. region servers)
> based on business logic and running the Mover tool could be difficult for
> admins.
> The proposal here is to provide an API in the Namenode itself to trigger
> storage policy satisfaction. A daemon thread inside the Namenode would track
> such calls and issue movement commands to the DataNodes.
> Will post the detailed design thoughts document soon.
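For reference, triggering satisfaction for a path from a client would look
roughly like this (the method and CLI command shown are the ones this feature
eventually shipped with; in the context of this discussion, treat the exact
names as assumptions):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SatisfyExample {
  public static void main(String[] args) throws Exception {
    Path target = new Path("/data/hbase");  // placeholder path
    DistributedFileSystem dfs =
        (DistributedFileSystem) target.getFileSystem(new Configuration());
    // Ask the Namenode to queue this path for storage policy satisfaction.
    // Equivalent CLI: hdfs storagepolicies -satisfyStoragePolicy -path /data/hbase
    dfs.satisfyStoragePolicy(target);
  }
}
{code}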