[
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280634#comment-16280634
]
Anu Engineer edited comment on HDFS-10285 at 12/6/17 6:36 PM:
--------------------------------------------------------------
@[~andrew.wang] Thanks for the comments.
bq. Adding a new service requires adding support in management frameworks like
Cloudera Manager or Ambari. This means support for deployment, configuration,
monitoring, rolling upgrade, and log collection.
I am not very familiar with these tools; I prefer to deploy my clusters without
them. So help me here a bit: are you suggesting that we should decide whether a
feature belongs inside the Namenode based on how inflexible these tools are?
Why is it so hard for, say, Cloudera Manager (I am just presuming that you will
be more familiar with it) to configure a new service? Isn't the sole purpose of
these tools to perform exactly this kind of management action?
I am hopeful (again, my understanding of these tools is minimal) that they
already have all the requisite framework in place, and that it is not as
onerous as you describe to support an additional daemon running in the cluster.
IMHO, if we decide which features go into the Namenode based on the
code-modification complexity of these tools, I am worried that we are putting
an unusually complex burden on the Namenode.
I suggest that we do the right thing for the Namenode based on the constraints
of our layer, and not worry about layers far above us.
@[~vinayrpet] Thank you for sharing your perspective.
bq. I'm coming at this from the standpoint of supporting Cloudera's Hadoop
customers.
Since I work for Hortonworks, I have a wealth of perspective on how customers
tend to use these features. Most customers will start off with this tool as is;
then they will discover that the queue length is not adequate for the moves to
happen in a reasonable time; they will increase the queue length, and then we
will discover that the Namenode is running out of memory. The next step is that
they will want us to run SPS based on various policies: move the blocks if they
are older than 3 hours, or if the load on the Namenode is less than X, or if
the number of YARN containers in the cluster is less than X.
Slowly but steadily, customers will want complex policies.
Here is the kicker: if SPS is inside the Namenode, then each time some feature
is added we are going to step into this huge argument about whether we should
have these complex features inside the Namenode.
So experience with Hortonworks customers tells me that we should prioritize
scale and the future needs of this feature rather than ease of code change for
management tools.
bq. IIUC, Main concerns to keep SPS (or any other such feature) in NameNode are
following.
I think you missed a critical argument: all scan-and-move functions of HDFS
today are outside the Namenode. I am proposing that we keep it that way. SPS is
not unique in any way, and we have a well-known pattern that works. In my mind,
management tools like Ambari should be able to address the ease-of-use part.
For people like me who are willing to use the shell, this does not seem to be
an additional burden.
bq. 1. Locking Load
This same process can be done from outside the Namenode. Hence we are proposing
that we move it outside.
bq. SPS should have client facing RPC server to learn about the paths to
satisfy policy. This comes with lot of deployment overhead as already mentioned
above by Andrew.
I seriously question this assertion. From a shell perspective, we can check
whether this config value is set and start the daemon from start-dfs.sh. Why is
this such a complicated task for Cloudera Manager or Ambari? I do not buy this
argument. How can something that can be done in 5 lines of code in Hadoop
become a task so complex that we would want to avoid that code path in Cloudera
Manager? I am sorry, that makes no sense to me.
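To make that concrete, here is a minimal sketch of the sort of change I have in
mind for *start-dfs.sh*. The config key and the {{sps}} daemon subcommand are
placeholders made up for illustration, not the actual names from the branch.
{code:bash}
# Sketch only: conditionally start an external SPS daemon, the same way
# start-dfs.sh already handles other optional daemons.
SPS_ENABLED=$("${HADOOP_HDFS_HOME}/bin/hdfs" getconf -confKey \
    dfs.storage.policy.satisfier.external.enabled 2>/dev/null)

if [[ "${SPS_ENABLED}" == "true" ]]; then
  echo "Starting storage policy satisfier daemon"
  "${HADOOP_HDFS_HOME}/bin/hdfs" --daemon start sps
fi
{code}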
bq. if SPS doesn't have its own RPC server, then it needs to scan the targets
by checking for the xattr recursively from the root ( / ) directory
What prevents us from adding this? We should do what is technically required.
The problem I think you are missing is that the current SPS has no policy
control over when it should run. I posit that it is not too far off that we
will have to build various kinds of policies to control it. I am not suggesting
that we need to do that before the merge. Being an independent service allows
for this kind of flexibility.
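As an aside, scanning for a marker xattr under a given root is not an exotic
operation even from outside the Namenode. The sketch below only illustrates the
access pattern with a made-up user xattr name ({{user.sps.pending}}); the real
SPS marker is an internal xattr, so an external satisfier would go through an
internal API or its own RPC rather than the shell.
{code:bash}
# Illustration only: recursively dump xattrs under a subtree and pick out
# the files that carry the (hypothetical) pending-satisfaction marker.
hdfs dfs -getfattr -R -d /warehouse 2>/dev/null | grep -B1 'user.sps.pending'
{code}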
bq. Memory
This is the most critical concern that I have. In one of the discussions with
the SPS developers, they pointed out to me that they want to make sure an SPS
move happens within a reasonable time. Apparently, this is a requirement from
HBase. If you have such a need, then the first thing an admin will do is
increase this queue size, and slowly but steadily SPS will eat into more and
more of the Namenode's memory. In fact, you bring up a good point: if SPS is an
independent service with its own RPC, maybe the xattr that we are going to
maintain inside the Namenode can be moved out too. I am open to that
suggestion. I was keying off the current design and did not reflect upon this
too deeply. Thanks for bringing this to my notice.
bq. CPU
Agreed, CPU is not a concern for me either.
bq. Impact of code
This is also not a concern that I have.
To summarize:
# I don't agree with the point that running SPS as an independent service is a
complex task. My data point is the changes that I would need to make in
*start-dfs.sh*. I will be glad to post a patch that illustrates that it is not
very complicated.
# If the core reason for putting a feature into the Namenode is less work for
management tools, I submit that it is the wrong rationale.
# We have an existing pattern (Balancer, Mover, DiskBalancer) where the
"scan and move" tools live outside the Namenode (see the commands sketched
below). I am not able to see any convincing reason for breaking this pattern.
# A feature like SPS in its current form is far from complete. There will be
features added (as there should be), and SPS being an independent service will
allow us to iterate more quickly. Most importantly, when we add a feature or
policy to SPS, we will not need to have this discussion of how it impacts the
Namenode.
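For reference, these are the existing scan-and-move tools I mean in point 3;
all of them already run outside the Namenode and are launched from the shell
(the path and hostname below are just placeholders):
{code:bash}
hdfs balancer -threshold 10              # rebalance blocks across datanodes
hdfs mover -p /cold-data                 # move blocks to satisfy storage policy
hdfs diskbalancer -plan dn1.example.com  # plan intra-datanode disk rebalancing
{code}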
> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch,
> HDFS-10285-consolidated-merge-patch-01.patch,
> HDFS-10285-consolidated-merge-patch-02.patch,
> HDFS-10285-consolidated-merge-patch-03.patch,
> HDFS-SPS-TestReport-20170708.pdf,
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf,
> Storage-Policy-Satisfier-in-HDFS-May10.pdf,
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies.
> These policies can be set on a directory or file to specify the user's
> preference for where the physical blocks should be stored. When the user sets
> the storage policy before writing data, the blocks can take advantage of the
> storage policy preference and are stored accordingly.
> If the user sets the storage policy after the file has been written and
> completed, the blocks will already have been written with the default storage
> policy (namely DISK). The user then has to run the ‘Mover tool’ explicitly,
> specifying all such file names as a list. In some distributed-system
> scenarios (e.g. HBase) it would be difficult to collect all the files and run
> the tool, as different nodes can write files independently and the files can
> have different paths.
> Another scenario is when the user renames a file from a directory with one
> effective storage policy (inherited from the parent directory) into a
> directory with a different storage policy. The inherited storage policy is
> not copied from the source, so the file takes the effective policy of the
> destination parent. This rename operation is just a metadata change in the
> Namenode; the physical blocks still remain under the source storage policy.
> So, tracking all such business-logic-driven file names across distributed
> nodes (e.g. region servers) and running the Mover tool could be difficult for
> admins.
> Here the proposal is to provide an API in the Namenode itself to trigger
> storage policy satisfaction. A daemon thread inside the Namenode would track
> such calls and send movement commands to the DNs.
> Will post the detailed design thoughts document soon.