[
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280634#comment-16280634
]
Anu Engineer edited comment on HDFS-10285 at 12/6/17 6:36 PM:
--------------------------------------------------------------
@[~andrew.wang] Thanks for the comments.
bq. Adding a new service requires adding support in management frameworks like
Cloudera Manager or Ambari. This means support for deployment, configuration,
monitoring, rolling upgrade, and log collection.
I am not very familiar with these tools; I prefer to deploy my clusters without
them. So help me here a bit: are you suggesting that we should decide whether a
feature belongs inside the Namenode based on how inflexible these tools are?
Why is it so hard for, say, Cloudera Manager (I am just presuming that you will
be more familiar with it) to configure a new service? Isn't the sole purpose of
these tools to perform exactly this kind of management action?
I am hopeful (again, my understanding of these tools is minimal) that they
already have all the requisite framework in place, and that it is not as
onerous as you describe to support an additional daemon running in the cluster.
IMHO, if we decide which features go into the Namenode based on the
code-modification complexity of these tools, I am worried that we are putting
an unusually complex burden on the Namenode.
I suggest that we do the right thing for the Namenode based on the constraints
of our layer, and not worry about layers far above us.
@[~vinayrpet] Thank you for sharing your perspective.
bq. I'm coming at this from the standpoint of supporting Cloudera's Hadoop
customers.
Since I work for Hortonworks, I have a wealth of perspective on how customers
tend to use these features. Most customers will start off with this tool as is;
then they will discover that the queue length is not adequate for the moves to
happen in a reasonable time; they will increase the queue length, and then we
will discover that the Namenode is running out of memory. The next step is that
they will want us to run SPS based on various policies: move the blocks if they
are older than 3 hours, or if the load on the Namenode is less than X, or if
the number of YARN containers in the cluster is less than X.
Slowly but steadily, customers will want complex policies.
Here is the kicker: if SPS is inside the Namenode, then each time some feature
is added we are going to step into this huge argument about whether we should
have these complex features inside the Namenode.
So experience with Hortonworks customers tells me that we should prioritize
scale and the future needs of this feature rather than ease of code change for
management tools.
bq. IIUC, Main concerns to keep SPS (or any other such feature) in NameNode are
following.
I think you missed a critical argument: all scan-and-move functions of HDFS
today are outside the Namenode. I am proposing that we keep it that way. SPS is
not unique in any way, and we have a well-known pattern that works. In my mind,
management tools like Ambari should be able to address the ease-of-use part.
For people like me who are willing to use the shell, this does not seem to be
an additional burden.
bq. 1. Locking Load
This same process can be done from outside the Namenode. Hence we are proposing
that we move it outside.
bq. SPS should have client facing RPC server to learn about the paths to
satisfy policy. This comes with lot of deployment overhead as already mentioned
above by Andrew.
I seriously question this assertion. From a shell perspective, we can check
whether this config value is set and start the daemon from start-dfs.sh. Why is
this such a complicated task for Cloudera Manager or Ambari? I do not buy this
argument. How can something that can be done in 5 lines of code in Hadoop
become a task so complex that we would want to avoid that code path in Cloudera
Manager? I am sorry, that makes no sense to me.
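To make that concrete, here is a minimal sketch of the sort of change I have in
mind for *start-dfs.sh*. The config key and the {{sps}} daemon subcommand are
placeholders made up for illustration, not the actual names from the branch.
{code:bash}
# Sketch only: conditionally start an external SPS daemon, the same way
# start-dfs.sh already handles other optional daemons.
SPS_ENABLED=$("${HADOOP_HDFS_HOME}/bin/hdfs" getconf -confKey \
    dfs.storage.policy.satisfier.external.enabled 2>/dev/null)

if [[ "${SPS_ENABLED}" == "true" ]]; then
  echo "Starting storage policy satisfier daemon"
  "${HADOOP_HDFS_HOME}/bin/hdfs" --daemon start sps
fi
{code}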
bq. if SPS doesn't have its own RPC server, then it needs to scan the targets
by checking for the xattr recursively from the root ( / ) directory
What prevents us from adding this? We should do what is technically required.
The problem I think you are missing is that the current SPS has no policy
control over when it should run. I posit that it is not too far off that we
will have to build various kinds of policies to control it. I am not suggesting
that we need to do that before the merge. Being an independent service allows
for this kind of flexibility.
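As an aside, scanning for a marker xattr under a given root is not an exotic
operation even from outside the Namenode. The sketch below only illustrates the
access pattern with a made-up user xattr name ({{user.sps.pending}}); the real
SPS marker is an internal xattr, so an external satisfier would go through an
internal API or its own RPC rather than the shell.
{code:bash}
# Illustration only: recursively dump xattrs under a subtree and pick out
# the files that carry the (hypothetical) pending-satisfaction marker.
hdfs dfs -getfattr -R -d /warehouse 2>/dev/null | grep -B1 'user.sps.pending'
{code}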
bq. Memory
This is the most critical concern that I have. In one of the discussions with
the SPS developers, they pointed out to me that they want to make sure an SPS
move happens within a reasonable time. Apparently, this is a requirement from
HBase. If you have such a need, then the first thing an admin will do is
increase this queue size, and slowly but steadily SPS will eat into more and
more of the Namenode's memory. In fact, you bring up a good point: if SPS is an
independent service with its own RPC, maybe the xattr that we are going to
maintain inside the Namenode can be moved out too. I am open to that
suggestion. I was keying off the current design and did not reflect upon this
too deeply. Thanks for bringing this to my notice.
bq. CPU
Agreed, CPU is not a concern for me either.
bq. Impact of code
This is also not a concern that I have.
To summarize:
# I don't agree with the point that running SPS as an independent service is a
complex task. My data point is the changes that I would need to make in
*start-dfs.sh*. I will be glad to post a patch that illustrates that it is not
very complicated.
# If the core reason for putting a feature into the Namenode is less work for
management tools, I submit that it is the wrong rationale.
# We have an existing pattern (Balancer, Mover, DiskBalancer) where the
"scan and move" tools live outside the Namenode (see the commands sketched
below). I am not able to see any convincing reason for breaking this pattern.
# A feature like SPS in its current form is far from complete. There will be
features added (as there should be), and SPS being an independent service will
allow us to iterate more quickly. Most importantly, when we add a feature or
policy to SPS, we will not need to have this discussion of how it impacts the
Namenode.
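For reference, these are the existing scan-and-move tools I mean in point 3;
all of them already run outside the Namenode and are launched from the shell
(the path and hostname below are just placeholders):
{code:bash}
hdfs balancer -threshold 10              # rebalance blocks across datanodes
hdfs mover -p /cold-data                 # move blocks to satisfy storage policy
hdfs diskbalancer -plan dn1.example.com  # plan intra-datanode disk rebalancing
{code}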
> Storage Policy Satisfier in Namenode
> ------------------------------------
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch,
> HDFS-10285-consolidated-merge-patch-01.patch,
> HDFS-10285-consolidated-merge-patch-02.patch,
> HDFS-10285-consolidated-merge-patch-03.patch,
> HDFS-SPS-TestReport-20170708.pdf,
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf,
> Storage-Policy-Satisfier-in-HDFS-May10.pdf,
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies.
> These policies can be set on a directory or file to specify the user's
> preference for where the physical blocks should be stored. When the user sets
> the storage policy before writing data, the blocks can take advantage of the
> storage policy preference and are stored accordingly.
> If the user sets the storage policy after the file has been written and
> completed, the blocks will already have been written with the default storage
> policy (namely DISK). The user then has to run the ‘Mover tool’ explicitly,
> specifying all such file names as a list. In some distributed-system
> scenarios (e.g. HBase) it would be difficult to collect all the files and run
> the tool, as different nodes can write files independently and the files can
> have different paths.
> Another scenario is when the user renames a file from a directory with one
> effective storage policy (inherited from the parent directory) into a
> directory with a different storage policy. The inherited storage policy is
> not copied from the source, so the file takes the effective policy of the
> destination parent. This rename operation is just a metadata change in the
> Namenode; the physical blocks still remain under the source storage policy.
> So, tracking all such business-logic-driven file names across distributed
> nodes (e.g. region servers) and running the Mover tool could be difficult for
> admins.
> Here the proposal is to provide an API in the Namenode itself to trigger
> storage policy satisfaction. A daemon thread inside the Namenode would track
> such calls and send movement commands to the DNs.
> Will post the detailed design thoughts document soon.