[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2017-10-08 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196264#comment-16196264
 ] 

Yufei Gu commented on YARN-3332:


Is this done by ATSv2? cc [~haibo.chen]

> [Umbrella] Unified Resource Statistics Collection per node
> --
>
> Key: YARN-3332
> URL: https://issues.apache.org/jira/browse/YARN-3332
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: Design - UnifiedResourceStatisticsCollection.pdf
>
>
> Today in YARN, NodeManager collects statistics like per container resource 
> usage and overall physical resources available on the machine. Currently this 
> is used internally in YARN by the NodeManager for only a limited usage: 
> automatically determining the capacity of resources on node and enforcing 
> memory usage to what is reserved per container.
> This proposal is to extend the existing architecture and collect statistics 
> for usage beyond the existing use cases.
> Proposal attached in comments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-06-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605966#comment-14605966
 ] 

Allen Wittenauer commented on YARN-3332:


Why is this a YARN JIRA and not in HADOOP?



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-04-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521926#comment-14521926
 ] 

Karthik Kambatla commented on YARN-3332:


[~vinodkv] - did you start implementing this? I would like to be involved in 
the work here - either implementing parts of it or reviewing most of it. 



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-04-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522774#comment-14522774
 ] 

Vinod Kumar Vavilapalli commented on YARN-3332:
---

Unfortunately, other pieces started moving in sooner than I could start on 
this: YARN-3534 (in progress), YARN-3334 (part of Timeline service next-gen 
YARN-2928). So I am planning to do a refactor once those two go into trunk.

Thanks for offering to get involved. Once they go in, I can file sub-tasks for 
moving forward.



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356418#comment-14356418
 ] 

Karthik Kambatla commented on YARN-3332:


bq. the machine level big picture is fragmented between YARN and HDFS (and 
HBase etc)
What constitutes the machine level big picture? Isn't this just the overall 
node's resource usage? YARN, at least as of today, doesn't need to know about 
the usage stats of HDFS or HBase. 

I have nothing against going the server route, except the additional daemon one 
might end up having to run.

bq. I anyways needed a service to expose an API for both admins/users as well 
as external systems beyond HDFS too - I can imagine tools being built on top of 
this.
It is not as clear to me. Let us say an admin and a user want usage stats about 
their YARN containers. The service can only provide the usage stats, while YARN 
will be able to provide other container metadata. Also, we should consider 
privacy of usage information. Will auth against this new service be additional 
overhead? 

bq. That said, it doesn't need to be service or library. I can think of a 
library that wires into the exposed API, though I haven't found uses for that 
yet.
Sorry, I didn't get that. Can you clarify/elaborate? 



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356260#comment-14356260
 ] 

Vinod Kumar Vavilapalli commented on YARN-3332:
---

Chose the service model because the machine level big picture is fragmented 
between YARN and HDFS (and HBase etc) - having a lower level common statistics 
layer is useful.

I anyways needed a service to expose an API for both admins/users as well as 
external systems beyond HDFS too - I can imagine tools being built on top of 
this.

That said, it doesn't need to be service or library. I can think of a library 
that wires into the exposed API, though I haven't found uses for that yet.



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356261#comment-14356261
 ] 

Vinod Kumar Vavilapalli commented on YARN-3332:
---

Agreed, this should be entirely possible.



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356309#comment-14356309
 ] 

Zhijie Shen commented on YARN-3332:
---

It sounds like a great proposal, thanks Vinod! A quick thought about the 
publishing channel of the collected statistics: I'm not sure how different the 
access pattern would be, but just thinking out loud, is it possible to reuse 
the timeline service to distribute the node statistics, so we avoid 
maintaining different but similar interfaces (or multiple data flow channels)? 
One step further, we can make the timeline service the main bus to transmit 
metrics from A to B.



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355583#comment-14355583
 ] 

Vinod Kumar Vavilapalli commented on YARN-3332:
---

Linking related tickets that can leverage this: YARN-2928, YARN-2745.



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355798#comment-14355798
 ] 

Karthik Kambatla commented on YARN-3332:


Thanks for filing this and working on the design, Vinod. I like the idea of a 
clean interface to get node and container resource usage info. 

Is there any reason why you think a service architecture is better than a 
common library? How much information is shared among the consumers of 
this interface? For instance, both HDFS and YARN would be interested in the 
availability and usage of CPU, memory, disk and network for the entire node. 
Isn't all the other information of interest to only one of them? 
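
To make the library option concrete, here is a minimal sketch of what an 
in-process, node-level reader could look like. It is illustration only: the 
names NodeUsage and NodeUsageReader are hypothetical and do not come from the 
design document, and the sketch uses only standard JDK calls, so it covers 
processor count, load average and disk space, not memory or network.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical value type; none of these names come from the design doc.
final class NodeUsage {
  final int processors;
  final double loadAverage;   // negative when the platform cannot report it
  final long diskTotalBytes;
  final long diskUsableBytes;

  NodeUsage(int processors, double loadAverage,
            long diskTotalBytes, long diskUsableBytes) {
    this.processors = processors;
    this.loadAverage = loadAverage;
    this.diskTotalBytes = diskTotalBytes;
    this.diskUsableBytes = diskUsableBytes;
  }
}

// Library model: each daemon calls this in-process instead of querying
// a separate statistics service.
final class NodeUsageReader {
  private NodeUsageReader() {}

  // Reads node-wide figures from the JDK; 'volume' selects the disk partition.
  static NodeUsage read(File volume) {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    return new NodeUsage(
        os.getAvailableProcessors(),
        os.getSystemLoadAverage(),
        volume.getTotalSpace(),
        volume.getUsableSpace());
  }
}
```

With a library like this, every consumer on the node repeats the same 
collection work; a service would do it once and share the result, at the cost 
of one more daemon to run.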

Have other questions/comments on the design, but will hold off until we decide 
on service vs library. 



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Lei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355891#comment-14355891
 ] 

Lei Guo commented on YARN-3332:
---

Is there any consideration of supporting plug-ins for customized resource 
statistics collection in the NM? We may need other types of resource 
information for scheduling purposes later, for example, GPU-related 
information. 



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355901#comment-14355901
 ] 

Li Lu commented on YARN-3332:
-

Hi [~grey], I think it's a nice idea. I think after YARN-2928, the timeline 
service layer would support this kind of usage (we're supporting metrics as a 
generic concept). What we need to do under this JIRA is to make the interface 
available on the NM level, I think? 

BTW, it would be cool to have GPU metrics. But I'm not sure if there are any 
general ways to gather this information. Would be helpful if you could 
elaborate a little bit more (if that's related to this JIRA). Thanks! 



[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node

2015-03-10 Thread Lei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355923#comment-14355923
 ] 

Lei Guo commented on YARN-3332:
---

To support customized resources, here is a quick list of areas we need to 
consider:
- resource definition: how the NM/RM understands the resource; this should be 
considered metrics-based
- plug-in framework in the NM/agent:
   * interface for passing resource information between the plug-in and the 
agent; this could be another RPC interface, so the plug-in can be written in 
any language
   * interface for loading/triggering the plug-in (optional); this interface is 
optional because the plug-in could be as simple as a cron job
- sample resource collection plug-in for a specific resource (or resource set); 
this could be a script or a Java class, depending on the plug-in framework 
design
- communication protocol between the RM and NM to support customized resources
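
The plug-in and agent pieces of the list above could be sketched roughly as 
follows. This is a hypothetical, in-process Java variant only: the names 
ResourceStatsPlugin, FixedGpuPlugin and StatsAgent are invented for 
illustration and are not existing YARN interfaces, the GPU figures are made-up 
constants rather than real measurements, and the RPC and loading interfaces 
are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plug-in contract; not an existing YARN interface.
interface ResourceStatsPlugin {
  // Short name of the resource this plug-in reports, e.g. "gpu".
  String resourceName();

  // Current metric values for that resource, keyed by metric name.
  Map<String, Double> collect();
}

// Sample plug-in that returns fixed, made-up values in place of real GPU probing.
final class FixedGpuPlugin implements ResourceStatsPlugin {
  @Override
  public String resourceName() {
    return "gpu";
  }

  @Override
  public Map<String, Double> collect() {
    Map<String, Double> metrics = new LinkedHashMap<>();
    metrics.put("devices", 2.0);
    metrics.put("memoryUsedMb", 512.0);
    return metrics;
  }
}

// Hypothetical agent that merges the output of every registered plug-in.
final class StatsAgent {
  private final List<ResourceStatsPlugin> plugins = new ArrayList<>();

  void register(ResourceStatsPlugin plugin) {
    plugins.add(plugin);
  }

  // Returns one flat map with keys of the form "<resource>.<metric>".
  Map<String, Double> snapshot() {
    Map<String, Double> out = new LinkedHashMap<>();
    for (ResourceStatsPlugin plugin : plugins) {
      for (Map.Entry<String, Double> e : plugin.collect().entrySet()) {
        out.put(plugin.resourceName() + "." + e.getKey(), e.getValue());
      }
    }
    return out;
  }
}
```

Keeping the metric values as a plain name-to-number map is one way to read 
"metrics-based": the agent does not need to know what a GPU is in order to 
forward its numbers.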

This topic is related to our proposal in June Hadoop Summit on multiple 
dimension scheduling.
