[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506204#comment-14506204
 ] 

Inigo Goiri commented on YARN-3482:
---

I implemented an approximation of approach #2. The NodeManager has no admin 
interface, so I cannot use one to push changes in available resources. 
Instead, I extended the NodeResourceMonitor to periodically read 
yarn-site.xml and update the amount of available resources in the NM. The NM 
then sends these values to the RM, which is able to update them.

Does anybody have better ideas about what other interfaces I could use to push 
these changes to the NM? The current approach works, but it might be a little 
heavy because it polls the file periodically.
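
One way to make the polling cheaper is to compare the file's modification time and parse the XML only when it changes. The following is a minimal sketch under that assumption; SiteFilePoller and its methods are hypothetical names, not part of Hadoop, and it reads the plain property/name/value layout with the JDK XML parser rather than Hadoop's Configuration class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper (not a Hadoop class): re-reads a site file only when it changes. */
final class SiteFilePoller {
    private final Path file;
    private FileTime lastSeen;                          // mtime of the version parsed last
    private Map<String, String> props = new HashMap<>();

    SiteFilePoller(Path file) {
        this.file = file;
    }

    /** Returns true if the file changed since the previous call and was parsed again. */
    boolean poll() {
        try {
            FileTime mtime = Files.getLastModifiedTime(file);
            if (mtime.equals(lastSeen)) {
                return false;                           // cheap path: nothing to parse
            }
            props = parse(file);
            lastSeen = mtime;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the property as an int, or defaultValue if it is absent. */
    int getInt(String name, int defaultValue) {
        String value = props.get(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    /** Reads every property element that has a name and a value child. */
    private static Map<String, String> parse(Path file) throws IOException {
        Map<String, String> result = new HashMap<>();
        try (InputStream in = Files.newInputStream(file)) {
            NodeList properties = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(in).getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element p = (Element) properties.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent();
                String value = p.getElementsByTagName("value").item(0).getTextContent();
                result.put(name.trim(), value);
            }
        } catch (ParserConfigurationException | SAXException e) {
            throw new IOException("Cannot parse " + file, e);
        }
        return result;
    }
}
```

A monitor thread could call poll() on each cycle and read yarn.nodemanager.resource.cpu-vcores and yarn.nodemanager.resource.memory-mb through getInt() only when poll() returns true. Two writes that land within the same timestamp granularity of the file system could be missed by this check.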

> Report NM available resources in heartbeat
> --
>
> Key: YARN-3482
> URL: https://issues.apache.org/jira/browse/YARN-3482
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager, resourcemanager
> Affects Versions: 2.7.0
> Reporter: Inigo Goiri
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> NMs are usually collocated with other processes like HDFS, Impala or HBase. 
> To manage this scenario correctly, YARN should be aware of the actual 
> available resources. The proposal is to have an interface to dynamically 
> change the available resources and report this to the RM in every heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505476#comment-14505476
 ] 

Inigo Goiri commented on YARN-3482:
---

[~grey], the ultimate target of this task is to provide an interface for 
external applications to change the amount of available resources in a node.
Part of YARN-3332 targets a smarter way of calculating the amount of 
resources available to an NM. That is somewhat related, but I think this 
effort is still needed.

Anyway, thanks for the pointer, as I'm targeting some of the sub-tasks 
described in that task.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Lei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505440#comment-14505440
 ] 

Lei Guo commented on YARN-3482:
---

What's the relationship between this and YARN-3332? They should be considered 
together.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505420#comment-14505420
 ] 

Inigo Goiri commented on YARN-3482:
---

I agree: option 2 is more distributed and better fits the model that we want 
to push. I'll implement it today.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505396#comment-14505396
 ] 

Karthik Kambatla commented on YARN-3482:
---

I like option 2 better.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-21 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505284#comment-14505284
 ] 

Inigo Goiri commented on YARN-3482:
---

[~kasha], the administration interface already has a method called 
updateNodeResource, which takes a map NodeId -> ResourceOption. So it basically 
has the functionality that we want, but it doesn't go through the NodeLabeling 
interface. The problem is that right now there's no way for the admin to 
trigger it. So we have two options:
1) Expose this interface to the RM, which is a little heavy because we would 
need to update the whole cluster.
2) Do what we planned (send the values in the heartbeat from the NM) and just 
leverage these mechanisms.
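
To make the difference between the two options concrete, here is a rough sketch of both update paths writing into the same RM-side table. All types and method names are stand-ins for illustration (plain strings instead of NodeId, a small Capacity class instead of ResourceOption); this is not the actual YARN admin protocol.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the RM's view of node capacities; not a YARN class. */
final class NodeCapacityTable {

    /** Memory in MB and virtual cores that a node offers to YARN. */
    static final class Capacity {
        final int memoryMb;
        final int vcores;

        Capacity(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    private final Map<String, Capacity> byNode = new HashMap<>();

    /** Option 1: an admin pushes a map of node -> capacity to the RM in one call. */
    void updateFromAdmin(Map<String, Capacity> updates) {
        byNode.putAll(updates);
    }

    /** Option 2: each NM reports only its own capacity in its heartbeat. */
    void updateFromHeartbeat(String nodeId, Capacity reported) {
        byNode.put(nodeId, reported);
    }

    /** Returns the last known capacity of the node, or null if it is unknown. */
    Capacity get(String nodeId) {
        return byNode.get(nodeId);
    }
}
```

In this sketch both paths end in the same store, which is the sense in which option 2 can reuse the existing mechanism: only the source of the update changes, from a central caller that must know every node to each node reporting for itself.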



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500894#comment-14500894
 ] 

Inigo Goiri commented on YARN-3482:
---

Hi Sunil G, yes, I'm talking about total CPU and total memory.
Combining this with YARN-3481, we can estimate the load on the node that is not 
caused by the containers (external processes).
Right now, the server could be overloaded by HBase, for example, and we would 
still be sending more load there.

As Karthik Kambatla mentions, this would be a very conservative scenario where 
the external processes have absolute priority.
This might be a desired behavior for some users but the proposal is to also add 
an interface to dynamically change the amount of available resources according 
to the behavior of the external processes.
Both approaches target the same problem and are complementary/orthogonal.

I understand this other approach of sending node utilization might be a little 
out of the scope of this JIRA but I could open a new one with this 
functionality.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500395#comment-14500395
 ] 

Sunil G commented on YARN-3482:
---

Hi [~elgoiri]
bq.  better to report the resources utilized by the machine.

Do you mean total CPU, total memory, etc.? 
Could you please elaborate on how this can help in doing better resource 
allocation? 

As I see it, if CPU affinity is not set, the distribution will be more generic 
and it may not be so easy to derive anything from that.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500348#comment-14500348
 ] 

Inigo Goiri commented on YARN-3482:
---

To make it match "yarn.nodemanager.resource.cpu-vcores" and 
"yarn.nodemanager.resource.memory-mb", I'm calling it 
"yarn.nodemanager.resource.dynamic-availability".



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500287#comment-14500287
 ] 

Inigo Goiri commented on YARN-3482:
---

Yes, that one is good. My proposal for the third one was meaningless...
I'll go code this.



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500277#comment-14500277
 ] 

Karthik Kambatla commented on YARN-3482:
---

For the third config, would something like 
yarn.nodemanager.dynamic-resource-availability=true/false be more descriptive? 

Admin interface (with a special command) sounds reasonable. 



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500223#comment-14500223
 ] 

Inigo Goiri commented on YARN-3482:
---

Makes sense. I think we should implement both and give the option to use one or 
the other. Proposal for the names of the variables? 
yarn.nodemanager.track-utilization.node=true/false
yarn.nodemanager.track-utilization.containers=true/false
yarn.nodemanager.resource=true/false

(The second one would be for YARN-3481.)

For the interface, the simplest thing is to edit 
yarn.nodemanager.resource.cpu-vcores and yarn.nodemanager.resource.memory-mb in 
yarn-site.xml. However, this implies modifying the XML periodically, which is 
kind of dirty for this purpose. I guess the cleanest option is to use the admin 
interface. Any preferences?



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500097#comment-14500097
 ] 

Karthik Kambatla commented on YARN-3482:
---

bq. With this and the containers utilization, we can estimate the utilization 
of external processes.

True, but I fear that will be too conservative. If we go that route, HBase 
RegionServers could grow aggressively and adversely affect resources under 
YARN. By having an interface for available resources, we ensure YARN 
aggressively schedules work to claim all available resources. Changes to these 
available resources could go through a secure interface that only admins or a 
whitelist of processes can access. 



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-16 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498926#comment-14498926
 ] 

Inigo Goiri commented on YARN-3482:
---

After some testing, I figured that it might be better to report the resources 
utilized by the machine.
With this and the container utilization, we can estimate the utilization of 
external processes.
This way, there's no need for an external interface and the scheduler could 
make the right decisions using the node utilization.
Thoughts?
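
The estimate described above amounts to a subtraction: whatever the node uses beyond what its containers use is attributed to external processes. A minimal sketch, assuming memory measured in MB; the class and method names are hypothetical and the same arithmetic would apply to CPU.

```java
/** Hypothetical helper: estimates load from processes that YARN does not manage. */
final class ExternalLoadEstimator {
    private ExternalLoadEstimator() {
    }

    /**
     * External usage = node-wide usage minus the sum of container usage.
     * The result is floored at zero because the two measurements may be
     * sampled at slightly different times.
     */
    static long externalMemoryMb(long nodeUsedMb, long[] containerUsedMb) {
        long containers = 0;
        for (long mb : containerUsedMb) {
            containers += mb;
        }
        return Math.max(0, nodeUsedMb - containers);
    }

    /** Memory the node can offer to YARN once external usage is set aside. */
    static long capacityForYarnMb(long nodeTotalMb, long nodeUsedMb, long[] containerUsedMb) {
        return Math.max(0, nodeTotalMb - externalMemoryMb(nodeUsedMb, containerUsedMb));
    }
}
```

For example, on a node with 64000 MB in total and 40000 MB in use, of which containers account for 8000 MB and 4000 MB, the external estimate is 28000 MB and 36000 MB remain for YARN. This treats external processes as having absolute priority over containers.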



[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat

2015-04-13 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493177#comment-14493177
 ] 

Inigo Goiri commented on YARN-3482:
---

The NM would have an interface (maybe a configuration file) to change how many 
resources are available. The NM would send this information in the NodeStatus 
to the RM in every heartbeat, as is done at registration time. This information 
would be exposed through the Web UI as it is now.

This will have some implications when the available resources drop below the 
allocated capacity.
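
The over-allocation case can be illustrated with a small model of the RM's bookkeeping for one node. This is only a sketch with hypothetical names, and the policy it shows (stop new allocations and wait for running containers to finish) is an assumption; the thread does not say how the situation should be resolved, and preempting containers would be another choice.

```java
/** Hypothetical per-node bookkeeping on the RM side; memory only, in MB. */
final class NodeHeadroom {
    private int capacityMb;   // latest value reported by the NM
    private int allocatedMb;  // memory already promised to running containers

    NodeHeadroom(int capacityMb) {
        this.capacityMb = capacityMb;
    }

    /** Called for every heartbeat: the NM may report a new capacity at any time. */
    void onHeartbeat(int reportedCapacityMb) {
        capacityMb = reportedCapacityMb;
    }

    /** Capacity minus allocations; negative while the node is over-allocated. */
    int headroomMb() {
        return capacityMb - allocatedMb;
    }

    boolean isOverAllocated() {
        return allocatedMb > capacityMb;
    }

    /** Grants the request only if it fits into the current headroom. */
    boolean tryAllocate(int mb) {
        if (mb > headroomMb()) {
            return false;
        }
        allocatedMb += mb;
        return true;
    }

    /** Called when a container finishes and returns its memory. */
    void release(int mb) {
        allocatedMb -= mb;
    }
}
```

If a node with 8192 MB has 6144 MB allocated and then reports 4096 MB, the headroom becomes -2048 MB. In this sketch no new container is placed there until enough running containers have released their memory.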
