[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-07 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-003.patch

Created {{ZKManager}}. Let me know what are the thought on the refactoring.

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-002.patch, YARN-6900-003.patch, 
> YARN-6900-YARN-2915-000.patch, YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-002.patch

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-002.patch, YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-002.patch

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: (was: YARN-6900-002.patch)

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: (was: YARN-6900-002.patch)

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-002.patch

* Rebase to trunk
* Refactored ZK utils

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-002.patch, YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6840) Implement zookeeper based store for scheduler configuration updates

2017-08-02 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111717#comment-16111717
 ] 

Inigo Goiri commented on YARN-6840:
---

[~wangda], we are already using curator for HDFS-10631 and YARN-6900.
Curator already wraps some of the common functionality.
There might be some functions that can be made into a utility but not sure if 
there is too much value.

> Implement zookeeper based store for scheduler configuration updates
> ---
>
> Key: YARN-6840
> URL: https://issues.apache.org/jira/browse/YARN-6840
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Jonathan Hung
>
> Right now there is only in-memory and leveldb based configuration store 
> supported. Need one which supports RM HA.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-01 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-YARN-2915-001.patch

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-01 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: (was: YARN-6900-YARN-2915-001.patch)

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-08-01 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-YARN-2915-001.patch

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch, 
> YARN-6900-YARN-2915-001.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6900) ZooKeeper based implementation of the FederationStateStore

2017-07-28 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6900:
--
Attachment: YARN-6900-YARN-2915-000.patch

Proposal using curator.

> ZooKeeper based implementation of the FederationStateStore
> --
>
> Key: YARN-6900
> URL: https://issues.apache.org/jira/browse/YARN-6900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Inigo Goiri
> Attachments: YARN-6900-YARN-2915-000.patch
>
>
> YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only 
> support SQL based stores, this JIRA tracks adding a ZooKeeper based 
> implementation for simplifying deployment as it's already popularly used for 
> {{RMStateStore}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5396) YARN large file broadcast service

2017-07-07 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078617#comment-16078617
 ] 

Inigo Goiri commented on YARN-5396:
---

We are also interested on this and we may be able to add resources for testing, 
etc.
[~mingma], can you add a pointer to the Spark bittorrent broadcasting?

> YARN large file broadcast service
> -
>
> Key: YARN-5396
> URL: https://issues.apache.org/jira/browse/YARN-5396
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: slides-prototype.pdf, YARN-broadcast-prototype.patch, 
> YARNFileTransferService-prototype.pdf
>
>
> In Hadoop and related softwares, there are demands of broadcasting large 
> files. For example, YARN application may localize large jar files on each 
> node; Hive may distribute large tables in fragment-replicate joins; docker 
> integration may broadcast large container image. The current local resource 
> based solution is to put the files on HDFS and let each node download from 
> HDFS, which is inefficient and not scalable. So we want to build a better 
> file transfer service in YARN so that all applications can use it broadcast 
> large file efficiently.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6633) Backport YARN-4167 to branch 2.7

2017-05-22 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6633:
--
Attachment: YARN-4167-branch-2.7.patch

> Backport YARN-4167 to branch 2.7
> 
>
> Key: YARN-6633
> URL: https://issues.apache.org/jira/browse/YARN-6633
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: YARN-4167-branch-2.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6633) Backport YARN-4167 to branch 2.7

2017-05-22 Thread Inigo Goiri (JIRA)
Inigo Goiri created YARN-6633:
-

 Summary: Backport YARN-4167 to branch 2.7
 Key: YARN-6633
 URL: https://issues.apache.org/jira/browse/YARN-6633
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Inigo Goiri
Assignee: Inigo Goiri
 Attachments: YARN-4167-branch-2.7.patch





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6632) Backport YARN-3425 to branch 2.7

2017-05-22 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-6632:
--
Attachment: YARN-3425-branch-2.7.patch

> Backport YARN-3425 to branch 2.7
> 
>
> Key: YARN-6632
> URL: https://issues.apache.org/jira/browse/YARN-6632
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: YARN-3425-branch-2.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6632) Backport YARN-3425 to branch 2.7

2017-05-22 Thread Inigo Goiri (JIRA)
Inigo Goiri created YARN-6632:
-

 Summary: Backport YARN-3425 to branch 2.7
 Key: YARN-6632
 URL: https://issues.apache.org/jira/browse/YARN-6632
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Inigo Goiri
Assignee: Inigo Goiri
 Attachments: YARN-3425-branch-2.7.patch





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4330) MiniYARNCluster is showing multiple Failed to instantiate default resource calculator warning messages.

2016-11-14 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664824#comment-15664824
 ] 

Inigo Goiri commented on YARN-4330:
---

After YARN-5356, I think that {{NodeManagerHardwareUtils}} should use 
{{getNodeResourceMonitorPlugin()}} to get the {{ResourceCalculatorPlugin}}.

> MiniYARNCluster is showing multiple  Failed to instantiate default resource 
> calculator warning messages.
> 
>
> Key: YARN-4330
> URL: https://issues.apache.org/jira/browse/YARN-4330
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test, yarn
>Affects Versions: 2.8.0
> Environment: OSX, JUnit
>Reporter: Steve Loughran
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: oct16-hard
> Attachments: YARN-4330.002.patch, YARN-4330.003.patch, 
> YARN-4330.01.patch
>
>
> Whenever I try to start a MiniYARNCluster on Branch-2 (commit #0b61cca), I 
> see multiple stack traces warning me that a resource calculator plugin could 
> not be created
> {code}
> (ResourceCalculatorPlugin.java:getResourceCalculatorPlugin(184)) - 
> java.lang.UnsupportedOperationException: Could not determine OS: Failed to 
> instantiate default resource calculator.
> java.lang.UnsupportedOperationException: Could not determine OS
> {code}
> This is a minicluster. It doesn't need resource calculation. It certainly 
> doesn't need test logs being cluttered with even more stack traces which will 
> only generate false alarms about tests failing. 
> There needs to be a way to turn this off, and the minicluster should have it 
> that way by default.
> Being ruthless and marking as a blocker, because its a fairly major 
> regression for anyone testing with the minicluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-07 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.011.patch

Fixing ResourceCalculatorPlugin configs.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch, YARN-5356.009.patch, 
> YARN-5356.010.patch, YARN-5356.011.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-04 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637216#comment-15637216
 ] 

Inigo Goiri commented on YARN-5356:
---

The failed unit test doesn't seem related to the patch.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch, YARN-5356.009.patch, 
> YARN-5356.010.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-04 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.010.patch

Missing import.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch, YARN-5356.009.patch, 
> YARN-5356.010.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.009.patch

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch, YARN-5356.009.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.008.patch

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: (was: YARN-5356.008.patch)

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.008.patch

Setting ResourceCalculatorPlugin consistently for the NM and removing 
read/write lock.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch, YARN-5356.008.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-11-03 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15632983#comment-15632983
 ] 

Inigo Goiri commented on YARN-5356:
---

Thanks [~jlowe] for the review.

Regarding the {{ResourceCalculatorPlugin}}, I'm very tempted to create a method 
{{ResourceCalculatorPlugin#getNMResourceCalculatorPlugin}} to encapsulate the 
code for RCP creation in {{ContainersMonitorImpl}} and make it easier to 
instantiate.

Regarding the locks, the fewer locks the better, as it's just an assignment, 
volatile should be more than enough.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-10-28 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.007.patch

Fixing rounding issue.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
>  Labels: oct16-medium
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch, 
> YARN-5356.007.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5259) Add two metrics at FSOpDurations for doing container assign and completed Performance statistical analysis

2016-10-27 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613649#comment-15613649
 ] 

Inigo Goiri commented on YARN-5259:
---

[~templedf], I checked and there are no other unit tests for related to 
{{FSOpDurations}} metrics.
We would have to add tests for that; the only related is 
{{testPerfMetricsInited()#TestFairScheduler}}.

> Add two metrics at FSOpDurations for doing container assign and completed 
> Performance statistical analysis
> --
>
> Key: YARN-5259
> URL: https://issues.apache.org/jira/browse/YARN-5259
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: ChenFolin
>  Labels: oct16-easy
> Attachments: YARN-5259-001.patch, YARN-5259-002.patch, 
> YARN-5259-003.patch, YARN-5259-004.patch
>
>
> If cluster is slow , we can not know Whether it is caused by container assign 
> or completed performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5259) Add two metrics at FSOpDurations for doing container assign and completed Performance statistical analysis

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5259:
--
Attachment: YARN-5259-004.patch

Rebased to trunk and minor fixes.

> Add two metrics at FSOpDurations for doing container assign and completed 
> Performance statistical analysis
> --
>
> Key: YARN-5259
> URL: https://issues.apache.org/jira/browse/YARN-5259
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: ChenFolin
>  Labels: oct16-easy
> Attachments: YARN-5259-001.patch, YARN-5259-002.patch, 
> YARN-5259-003.patch, YARN-5259-004.patch
>
>
> If cluster is slow , we can not know Whether it is caused by container assign 
> or completed performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5259) Add two metrics at FSOpDurations for doing container assign and completed Performance statistical analysis

2016-10-27 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613521#comment-15613521
 ] 

Inigo Goiri commented on YARN-5259:
---

[~chenfolin], I'm rebasing this and fixing the {{monotonicNow()}}.

> Add two metrics at FSOpDurations for doing container assign and completed 
> Performance statistical analysis
> --
>
> Key: YARN-5259
> URL: https://issues.apache.org/jira/browse/YARN-5259
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: ChenFolin
>  Labels: oct16-easy
> Attachments: YARN-5259-001.patch, YARN-5259-002.patch, 
> YARN-5259-003.patch
>
>
> If cluster is slow , we can not know Whether it is caused by container assign 
> or completed performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5259) Add two metrics at FSOpDurations for doing container assign and completed Performance statistical analysis

2016-10-27 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613115#comment-15613115
 ] 

Inigo Goiri commented on YARN-5259:
---

If you want to push it today, I can rebase and fix the typo in the annotation.

> Add two metrics at FSOpDurations for doing container assign and completed 
> Performance statistical analysis
> --
>
> Key: YARN-5259
> URL: https://issues.apache.org/jira/browse/YARN-5259
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: ChenFolin
>  Labels: oct16-easy
> Attachments: YARN-5259-001.patch, YARN-5259-002.patch, 
> YARN-5259-003.patch
>
>
> If cluster is slow , we can not know Whether it is caused by container assign 
> or completed performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5215) Scheduling containers based on external load in the servers

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5215:
--
Attachment: YARN-5215.002.patch

Rebased to trunk.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
>  Labels: oct16-hard
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch, 
> YARN-5215.002.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: YARN-2965.003.patch

Rebasing to latest trunk.

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
>  Labels: oct16-hard
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, 
> YARN-2965.002.patch, YARN-2965.003.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5215) Scheduling containers based on external load in the servers

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri reassigned YARN-5215:
-

Assignee: Inigo Goiri

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
>  Labels: oct16-hard
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5548) Random test failure TestRMRestart#testFinishedAppRemovalAfterRMRestart

2016-10-27 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612878#comment-15612878
 ] 

Inigo Goiri commented on YARN-5548:
---

It may solve this issue.

> Random test failure TestRMRestart#testFinishedAppRemovalAfterRMRestart
> --
>
> Key: YARN-5548
> URL: https://issues.apache.org/jira/browse/YARN-5548
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-easy
> Attachments: YARN-5548.0001.patch, YARN-5548.0002.patch, 
> YARN-5548.0003.patch
>
>
> https://builds.apache.org/job/PreCommit-YARN-Build/12850/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testFinishedAppRemovalAfterRMRestart/
> {noformat}
> Error Message
> Stacktrace
> java.lang.AssertionError: expected null, but was: application_submission_context { application_id { id: 1 cluster_timestamp: 
> 1471885197388 } application_name: "" queue: "default" priority { priority: 0 
> } am_container_spec { } cancel_tokens_when_complete: true maxAppAttempts: 2 
> resource { memory: 1024 virtual_cores: 1 } applicationType: "YARN" 
> keep_containers_across_application_attempts: false 
> attempt_failures_validity_interval: 0 am_container_resource_request { 
> priority { priority: 0 } resource_name: "*" capability { memory: 1024 
> virtual_cores: 1 } num_containers: 0 relax_locality: true 
> node_label_expression: "" execution_type_request { execution_type: GUARANTEED 
> enforce_execution_type: false } } } user: "jenkins" start_time: 1471885197417 
> application_state: RMAPP_FINISHED finish_time: 1471885197478>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotNull(Assert.java:664)
>   at org.junit.Assert.assertNull(Assert.java:646)
>   at org.junit.Assert.assertNull(Assert.java:656)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testFinishedAppRemovalAfterRMRestart(TestRMRestart.java:1656)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5428) Allow for specifying the docker client configuration directory

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5428:
--
Labels: oct16-medium  (was: )

> Allow for specifying the docker client configuration directory
> --
>
> Key: YARN-5428
> URL: https://issues.apache.org/jira/browse/YARN-5428
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>  Labels: oct16-medium
> Attachments: YARN-5428.001.patch, YARN-5428.002.patch, 
> YARN-5428.003.patch, YARN-5428.004.patch, 
> YARN-5428Allowforspecifyingthedockerclientconfigurationdirectory.pdf
>
>
> The docker client allows for specifying a configuration directory that 
> contains the docker client's configuration. It is common to store "docker 
> login" credentials in this config, to avoid the need to docker login on each 
> cluster member. 
> By default the docker client config is $HOME/.docker/config.json on Linux. 
> However, this does not work with the current container executor user 
> switching and it may also be desirable to centralize this configuration 
> beyond the single user's home directory.
> Note that the command line arg is for the configuration directory NOT the 
> configuration file.
> This change will be needed to allow YARN to automatically pull images at 
> localization time or within container executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5366) Add support for toggling the removal of completed and failed docker containers

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5366:
--
Labels: oct16-medium  (was: )

> Add support for toggling the removal of completed and failed docker containers
> --
>
> Key: YARN-5366
> URL: https://issues.apache.org/jira/browse/YARN-5366
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>  Labels: oct16-medium
> Attachments: YARN-5366.001.patch, YARN-5366.002.patch, 
> YARN-5366.003.patch, YARN-5366.004.patch, YARN-5366.005.patch, 
> YARN-5366.006.patch
>
>
> Currently, completed and failed docker containers are removed by 
> container-executor. Add a job level environment variable to 
> DockerLinuxContainerRuntime to allow the user to toggle whether they want the 
> container deleted or not and remove the logic from container-executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5268) DShell AM fails java.lang.InterruptedException

2016-10-27 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612754#comment-15612754
 ] 

Inigo Goiri commented on YARN-5268:
---

I would make the comment in {{shouldExit()}} as a javadoc for the private 
method.

> DShell AM fails java.lang.InterruptedException
> --
>
> Key: YARN-5268
> URL: https://issues.apache.org/jira/browse/YARN-5268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Tan, Wangda
>Priority: Critical
>  Labels: oct16-easy
> Attachments: YARN-5268.1.patch
>
>
> Distributed Shell AM failed with the following error
> {Code}
> 16/06/16 11:08:10 INFO impl.NMClientAsyncImpl: NMClient stopped.
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application 
> completed. Signalling finish to RM
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Diagnostics., 
> total=16, completed=19, allocated=21, failed=4
> 16/06/16 11:08:10 INFO impl.AMRMClientImpl: Waiting for application to be 
> successfully unregistered.
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application Master 
> failed. exiting
> 16/06/16 11:08:10 INFO impl.AMRMClientAsyncImpl: Interrupted while waiting 
> for queue
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
> End of LogType:AppMaster.stderr
> {Code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5548) Random test failure TestRMRestart#testFinishedAppRemovalAfterRMRestart

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5548:
--
Labels: oct16-easy  (was: )

> Random test failure TestRMRestart#testFinishedAppRemovalAfterRMRestart
> --
>
> Key: YARN-5548
> URL: https://issues.apache.org/jira/browse/YARN-5548
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-easy
> Attachments: YARN-5548.0001.patch, YARN-5548.0002.patch, 
> YARN-5548.0003.patch
>
>
> https://builds.apache.org/job/PreCommit-YARN-Build/12850/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testFinishedAppRemovalAfterRMRestart/
> {noformat}
> Error Message
> Stacktrace
> java.lang.AssertionError: expected null, but was: application_submission_context { application_id { id: 1 cluster_timestamp: 
> 1471885197388 } application_name: "" queue: "default" priority { priority: 0 
> } am_container_spec { } cancel_tokens_when_complete: true maxAppAttempts: 2 
> resource { memory: 1024 virtual_cores: 1 } applicationType: "YARN" 
> keep_containers_across_application_attempts: false 
> attempt_failures_validity_interval: 0 am_container_resource_request { 
> priority { priority: 0 } resource_name: "*" capability { memory: 1024 
> virtual_cores: 1 } num_containers: 0 relax_locality: true 
> node_label_expression: "" execution_type_request { execution_type: GUARANTEED 
> enforce_execution_type: false } } } user: "jenkins" start_time: 1471885197417 
> application_state: RMAPP_FINISHED finish_time: 1471885197478>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotNull(Assert.java:664)
>   at org.junit.Assert.assertNull(Assert.java:646)
>   at org.junit.Assert.assertNull(Assert.java:656)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testFinishedAppRemovalAfterRMRestart(TestRMRestart.java:1656)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5268) DShell AM fails java.lang.InterruptedException

2016-10-27 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5268:
--
Labels: oct16-easy  (was: )

> DShell AM fails java.lang.InterruptedException
> --
>
> Key: YARN-5268
> URL: https://issues.apache.org/jira/browse/YARN-5268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Tan, Wangda
>Priority: Critical
>  Labels: oct16-easy
> Attachments: YARN-5268.1.patch
>
>
> Distributed Shell AM failed with the following error
> {Code}
> 16/06/16 11:08:10 INFO impl.NMClientAsyncImpl: NMClient stopped.
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application 
> completed. Signalling finish to RM
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Diagnostics., 
> total=16, completed=19, allocated=21, failed=4
> 16/06/16 11:08:10 INFO impl.AMRMClientImpl: Waiting for application to be 
> successfully unregistered.
> 16/06/16 11:08:10 INFO distributedshell.ApplicationMaster: Application Master 
> failed. exiting
> 16/06/16 11:08:10 INFO impl.AMRMClientAsyncImpl: Interrupted while waiting 
> for queue
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
> End of LogType:AppMaster.stderr
> {Code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-10-20 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.006.patch

Using configured capacity when physical is not available.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-10-20 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592776#comment-15592776
 ] 

Inigo Goiri commented on YARN-5356:
---

OK, that makes sense too. Let me switch to that.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-10-20 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.005.patch

Fixing NullPointerException when resource calculator not available.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch, YARN-5356.005.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-10-20 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592467#comment-15592467
 ] 

Inigo Goiri commented on YARN-5356:
---

Makes sense. Not sure what to do in this case though:
# Leave physicalResource as null
# Set physicalResource to no resources
I would go for the first one but I'm open to suggestions.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-09-20 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.004.patch

Fixing compilation.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, 
> YARN-5356.004.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-09-19 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.003.patch

Fixing NPE in PB.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-09-19 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.002.patch

Fixing NPE in PB for physical resources.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch, YARN-5356.002.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-09-19 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505089#comment-15505089
 ] 

Inigo Goiri commented on YARN-5356:
---

True, this is not done properly. I need to add a unit test for the PB too. I'll 
post a patch soon.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5473) Expose over-allocation in the Resource Manager UI

2016-08-04 Thread Inigo Goiri (JIRA)
Inigo Goiri created YARN-5473:
-

 Summary: Expose over-allocation in the Resource Manager UI
 Key: YARN-5473
 URL: https://issues.apache.org/jira/browse/YARN-5473
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Inigo Goiri


When enabling over-allocation of nodes, the resources in the cluster change. We 
need to surface this information for users to understand these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-22 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390204#comment-15390204
 ] 

Inigo Goiri commented on YARN-2664:
---

Other than the javadoc issue which I'm not sure how to fix and the unrelated 
unit test that fial, I think latest patch is ready for review.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.14.patch, YARN-2664.15.patch, 
> YARN-2664.16.patch, YARN-2664.17.patch, YARN-2664.2.patch, YARN-2664.3.patch, 
> YARN-2664.4.patch, YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, 
> YARN-2664.8.patch, YARN-2664.9.patch, YARN-2664.patch, legal.patch, 
> screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-22 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.17.patch

Fixed unit test.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.14.patch, YARN-2664.15.patch, 
> YARN-2664.16.patch, YARN-2664.17.patch, YARN-2664.2.patch, YARN-2664.3.patch, 
> YARN-2664.4.patch, YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, 
> YARN-2664.8.patch, YARN-2664.9.patch, YARN-2664.patch, legal.patch, 
> screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-21 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.16.patch

Fixing findbugs. I don't think I can solve the rest. 

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.14.patch, YARN-2664.15.patch, 
> YARN-2664.16.patch, YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, 
> YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, 
> YARN-2664.9.patch, YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-21 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.15.patch

Went too crazy with the finals.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.14.patch, YARN-2664.15.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, 
> YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-21 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.14.patch

Jenkins fixes.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.14.patch, YARN-2664.2.patch, YARN-2664.3.patch, 
> YARN-2664.4.patch, YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, 
> YARN-2664.8.patch, YARN-2664.9.patch, YARN-2664.patch, legal.patch, 
> screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-20 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.13.patch

Fixing checkstyles and jenkins errors.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.13.patch, YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, 
> YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, 
> YARN-2664.9.patch, YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-07-18 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383189#comment-15383189
 ] 

Inigo Goiri commented on YARN-5356:
---

I agree on getting more people to check this approach, [~kasha], [~wangda]?

I thought about the unit test a little and I think we could collect some fake 
physical resources and then check that they make it into the RM.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-07-18 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.002.patch

Fixed compilation issue and checkstyle.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch, 
> YARN-5356.002.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager

2016-07-18 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.001.patch

Renaming to physical resources and adding to RMNode.

> NodeManager should communicate physical resource capability to ResourceManager
> --
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch, YARN-5356.001.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if the NM also communicated the 
> actual physical resource capabilities of the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-14 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378465#comment-15378465
 ] 

Inigo Goiri commented on YARN-2965:
---

The failed unit test seems unrelated and the checkstyle issues are related to 
private accesses. Anybody available for review?

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, 
> YARN-2965.002.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-12 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5356:
--
Attachment: YARN-5356.000.patch

First proposal for sending physical resources in the node to the RM.

> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-12 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri reassigned YARN-5356:
-

Assignee: Inigo Goiri

> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Inigo Goiri
> Attachments: YARN-5356.000.patch
>
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-12 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373830#comment-15373830
 ] 

Inigo Goiri commented on YARN-5356:
---

It looks like there is some work we may want to leverage in YARN-4081 where 
they have YARN-4081 to add multiple resources.
I'd like to have some feedback from [~vvasudev] and [~asuresh] about their 
opinion on this.

> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-12 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373186#comment-15373186
 ] 

Inigo Goiri commented on YARN-5356:
---

[~nroberts], I think that changing machine resources it's not that common and 
admins could always restart the Node Manager. in that case, I would just extend 
{{RegisterNodeManagerRequestProto}} in 
{{yarn_server_common_service_protos.proto}} and populate it in 
{{NodeStatusUpdaterImpl#registerWithRM()}}. If we don't care about reporting 
network, then adding {{Resource}} is fine. However, if we go into network and 
disk, {{ResourceUtilization}} has those fields but I don't think that the 
semantics we want to provide matches with resource utilization.

I can post a patch with these changes if you want.

> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5179) Issue of CPU usage of containers

2016-07-11 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371957#comment-15371957
 ] 

Inigo Goiri commented on YARN-5179:
---

[~asuresh], given that YARN-5117 acknowledges that there is a problem with the 
way we calculated the CPU usage, I agree with [~Zhongkai] that we should 
revisit the way that milliVCores is computed in {{ContainersMonitorImpl}}. 
[~Zhongkai], could you upload a patch with your proposal to see if it makes 
sense?

> Issue of CPU usage of containers
> 
>
> Key: YARN-5179
> URL: https://issues.apache.org/jira/browse/YARN-5179
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
> Environment: Both on Windows and Linux
>Reporter: Zhongkai Mi
>
> // Multiply by 1000 to avoid losing data when converting to int 
>int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000 
>   * maxVCoresAllottedForContainers /nodeCpuPercentageForYARN); 
> This formula will not get right CPU usage based vcore if vcores != physical 
> cores. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-11 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371947#comment-15371947
 ] 

Inigo Goiri edited comment on YARN-5356 at 7/12/16 12:15 AM:
-

In general, we have 3 values:
* Actual resources of the full machine. This currently comes from 
{{NodeManagerHardwareUtils}} if I remember correctly. For example, it can be 12 
cores.
* Resource available for the Node Manager. This is currently defined in 
yarn-site.xml with key {{yarn.nodemanager.resource.cpu-vcores}} or with the 
{{updateNodeResource()}}. For example, 6 cores.
* Actual utilization of the machine. This is extracted in the 
{{NodeResourceMonitor}} with the {{ResourceCalculatorPlugin}}. And it can be 
400%, which would imply 4 out of the 12 cores used.

[~nroberts], I understand that your problem is that with the current approach 
you know that you have 6 cores available to the NM and 4 of them are used. 
However, the machine is not that utilized (~30%). Correct? In that case, we 
would only need to report the actual size of the machine at registration time 
as it would never change. Not sure that {{ResourceUtilization}} would be the 
right place for that as it would be reported in every heartbeat continuously.



was (Author: elgoiri):
In general, we have 3 values:
* Actual resources of the full machine. This currently comes from 
{{NodeManagerHardwareUtils}} if I remember correctly. For example, it can be 12 
cores for example
* Resource available for the Node Manager. This is currently defined in 
yarn-site.xml with key {{yarn.nodemanager.resource.cpu-vcores}} or with the 
{{updateNodeResource()}}. For example, 6 cores.
* Actual utilization of the machine. This is extracted in the 
{{NodeResourceMonitor}} with the {{ResourceCalculatorPlugin}}. And it can be 
400%, which would imply 4 out of the 12 cores used.

[~nroberts], I understand that your problem is that with the current approach 
you know that you have 6 cores available to the NM and 4 of them are used. 
However, the machine is not that utilized (~30%). Correct? In that case, we 
would only need to report the actual size of the machine at registration time 
as it would never change. Not sure that {{ResourceUtilization}} would be the 
right place for that as it would be reported in every heartbeat continuously.


> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5356) ResourceUtilization should also include resource availability

2016-07-11 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371947#comment-15371947
 ] 

Inigo Goiri commented on YARN-5356:
---

In general, we have 3 values:
* Actual resources of the full machine. This currently comes from 
{{NodeManagerHardwareUtils}} if I remember correctly. For example, it can be 12 
cores for example
* Resource available for the Node Manager. This is currently defined in 
yarn-site.xml with key {{yarn.nodemanager.resource.cpu-vcores}} or with the 
{{updateNodeResource()}}. For example, 6 cores.
* Actual utilization of the machine. This is extracted in the 
{{NodeResourceMonitor}} with the {{ResourceCalculatorPlugin}}. And it can be 
400%, which would imply 4 out of the 12 cores used.

[~nroberts], I understand that your problem is that with the current approach 
you know that you have 6 cores available to the NM and 4 of them are used. 
However, the machine is not that utilized (~30%). Correct? In that case, we 
would only need to report the actual size of the machine at registration time 
as it would never change. Not sure that {{ResourceUtilization}} would be the 
right place for that as it would be reported in every heartbeat continuously.


> ResourceUtilization should also include resource availability
> -
>
> Key: YARN-5356
> URL: https://issues.apache.org/jira/browse/YARN-5356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, these aren't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: YARN-2965.002.patch

Compilation issue.

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, 
> YARN-2965.002.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: (was: YARN-2965.002.patch)

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: YARN-2965.002.patch

Fixed checkstyle and unit tests.

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, 
> YARN-2965.002.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: YARN-2965.001.patch

Fixing Jenkins issues and adding ResourceTimeTracker for resources.

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2965:
--
Attachment: YARN-2965.000.patch

First proposal to report disk and network utilizations to the RM. Missing Linux 
bits. Thinking on refactoring CpuTimeTracker to make it generic for disk and 
network utilizations.

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: YARN-2965.000.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-07-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri reassigned YARN-2965:
-

Assignee: Inigo Goiri  (was: Robert Grandl)

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
> Attachments: ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-07-06 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365512#comment-15365512
 ] 

Inigo Goiri commented on YARN-5215:
---

[~kasha], for the node utilization, in Windows it's pretty fast and I haven't 
found any issues.
My guess is that in Linux it should be pretty fast too as it's checking single 
values in /proc and not going through the whole tree.

Regarding disk and network, the node monitoring is already in trunk for both 
Windows and Linux.
The utilization it's not sent to the RM but I have that pending in YARN-2965. I 
can push for that this week.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-07-06 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365259#comment-15365259
 ] 

Inigo Goiri commented on YARN-5215:
---

In our internal deployment, we always reserve a buffer for the external load to 
spike. This is set by tuning the available cores and memory.

[~jlowe], as you mention, we internally have preemption at both RM and NM 
level. We only enable the one at NM level as it's the one with the best latency 
and we don't have a need for the RM level one. As I mention in a previous 
comment, this patch it's just to do scheduling in the RM, if we want to go with 
the full solution, we would need:
* Schedule containers considering external load in the RM
* Expose external load in the UI
* Use history to smooth external load
* Preempting containers from the RM based on external load
* Preempting containers from the NM based on external load

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2016-07-01 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-2664:
--
Attachment: YARN-2664.12.patch

Rebased patch.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Inigo Goiri
>  Labels: BB2015-05-TBR
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.11.patch, YARN-2664.12.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, 
> YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-22 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345539#comment-15345539
 ] 

Inigo Goiri commented on YARN-5171:
---

[~asuresh], other than the unused import, everything else (including the 
unrelated unit test failure) look good.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch, YARN-5171.009.patch, YARN-5171.010.patch, 
> YARN-5171.011.patch, YARN-5171.012.patch, YARN-5171.013.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-22 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.012.patch

Added more comments and improved resiliency of unit test.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch, YARN-5171.009.patch, YARN-5171.010.patch, 
> YARN-5171.011.patch, YARN-5171.012.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-22 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.011.patch

Fixing checktyles and naming.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch, YARN-5171.009.patch, YARN-5171.010.patch, 
> YARN-5171.011.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-21 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.010.patch

Adding RM check in the unit test.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch, YARN-5171.009.patch, YARN-5171.010.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-21 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.009.patch

Trying to extend the unit test with opportunistic checks.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch, YARN-5171.009.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-20 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340950#comment-15340950
 ] 

Inigo Goiri commented on YARN-5171:
---

[~subru], agreed on the naming thing; external can be confusing with YARN-5215.
We need to agree on a naming as this patch has a couple variables that make 
reference to it.
What about "distributed"? We also need to tweak {{ResourceUsage}} to do proper 
accounting.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-20 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.008.patch

Adding one simple unit test and tweaking the names again.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch, 
> YARN-5171.008.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-19 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.007.patch

Adding external to methods and changing {{ResourceUsage}}. Missing unit test.

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch, YARN-5171.007.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node

2016-06-16 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5171:
--
Attachment: YARN-5171.006.patch

Tackling some of the comments from [~kkaranasos].

> Extend DistributedSchedulerProtocol to notify RM of containers allocated by 
> the Node
> 
>
> Key: YARN-5171
> URL: https://issues.apache.org/jira/browse/YARN-5171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Inigo Goiri
> Attachments: YARN-5171.000.patch, YARN-5171.001.patch, 
> YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, 
> YARN-5171.005.patch, YARN-5171.006.patch
>
>
> Currently, the RM does not know about Containers allocated by the 
> OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the 
> Distributed Scheduler request interceptor and the protocol to notify the RM 
> of new containers as and when they are allocated at the NM. The 
> {{RMContainer}} should also be extended to expose the {{ExecutionType}} of 
> the container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-15 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332483#comment-15332483
 ] 

Inigo Goiri commented on YARN-5215:
---

[~kkaranasos] I kind of like the "fake container" approach. In addition, that 
would expose the real status of the cluster to the users and I think it would 
cover [~jlowe]'s concern. For distinguishing fake containers, we could make 
them be part of a fake application and replace the framework in the Application 
Type to something like EXTERNAL and the actual external load as the application 
name. We should leverage the unmanaged AM concept.

I also like this approach because we can create a default service that does the 
same thing I'm proposing in this patch and just update the size of the 
container based on node and containers utilization.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-15 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332021#comment-15332021
 ] 

Inigo Goiri commented on YARN-5215:
---

[~ccurino], converting this to an umbrella makes sense to me. However, I'm not 
sure we reached to an agreement even with the simplest approach (patch v001). 
Once we agree on the general approach, I would open the following subtasks:
* Schedule containers considering external load in the RM
* Expose external load in the UI
* Use history to smooth external load
* Preempting containers from the RM based on external load
* Preempting containers from the NM based on external load

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-14 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330759#comment-15330759
 ] 

Inigo Goiri commented on YARN-5215:
---

My initial proposal was to add a generic support for external resources. 
However, we could also follow the node-level agent approach which could even 
show as a unmanaged fake container. That solution is also OK with me.

Going a little bit deeper into the example, in our most extreme scenario, we 
would set the guaranteed to 0GB and the opportuinistic to 16GB.
In any case, if we go into preemption, then we should leverage what we are 
doing in YARN-1011.

Regarding YARN-5202, they use the concept of preemptable which is pretty much 
the same as the OPPORTUNISTIC one. Actually, in our internal deployment right 
now, we just assume that everything running on YARN is preemptable and we 
preempt the youngest container.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-14 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330649#comment-15330649
 ] 

Inigo Goiri commented on YARN-5215:
---

[~kasha], in our use case we are targeting co-locating with latency sensitive 
workloads and they have diurnal patterns. For this type of workload, we need to 
be fairly reactive. Actually, preempting containers at the NM following the 
{{ContainersMonitor}} loop would be ideal.

The improvements in utilization are significant as right now we are just 
reserving for the peak of the latency sensitive workloads (around ~50%) of the 
machine. We tried at some point to have a separate service to periodically 
change the resources of the NMs but it's harder to operate.

In any case, in this first patch, we are just preventing scheduling containers 
and not adding preemption. I can add the following changes to the current patch:
# UI improvements
# History in the utilization to take decisions
# Preempting containers from the RM
# Preempting containers from the NM

The problem with preemption is that we would go into what to preempt and that 
might have some dependencies in the opportunistic stuff in YARN-1011.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-13 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328631#comment-15328631
 ] 

Inigo Goiri commented on YARN-5215:
---

[~sunilg], I think your first two points are related. Let me try to give a full 
example of what it is now and what it would become with this. Right now, if we 
have a node with 16GB, we usually set the usable to the NM 
({{yarn.nodemanager.resource.memory-mb}}) to a smaller number like 14GB; the 
idea behind this is that the other services (NM itself and DN) can potentially 
use 2GB.

With this new approach, we can potentially set 
{{yarn.nodemanager.resource.memory-mb}} to 16GB and if the external processes 
consume 1GB, the NM can only allocate containers up to 15GB. Note that in our 
actual deployment, we set {{yarn.nodemanager.resource.memory-mb}} to something 
like 15GB to have some reserve to handle spikes.

Regarding your third point, we can add some kind of EWMA but I'm open to other 
proposals for smoothing spikes in the utilization numbers.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-09 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322829#comment-15322829
 ] 

Inigo Goiri commented on YARN-5215:
---

[~jlowe], my view on this feature is more like making the scheduler our of a 
"container" that is not managed by YARN. This is a dynamic size container. 
Before moving forward with this, we should agree that this is a proper 
abstraction and complementary to the overcommit effort.

Regarding the UI stuff, let's try to make a list of what to expose and I'll 
give it a try. My first idea is to expose the following per node/queue/cluster:
* Node utilization
* Container utilization
* External utilization
* Allocated/unallocated resources

Can you think on any other metric/aggregation we should be reporting?

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-09 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322829#comment-15322829
 ] 

Inigo Goiri edited comment on YARN-5215 at 6/9/16 4:38 PM:
---

[~jlowe], my view on this feature is more like making the scheduler aware of a 
"container" that is not managed by YARN and has a dynamic size. Before moving 
forward with this, we should agree that this is a proper abstraction and 
complementary to the overcommit effort.

Regarding the UI stuff, let's try to make a list of what to expose and I'll 
give it a try. My first idea is to expose the following per node/queue/cluster:
* Node utilization
* Container utilization
* External utilization
* Allocated/unallocated resources

Can you think on any other metric/aggregation we should be reporting?


was (Author: elgoiri):
[~jlowe], my view on this feature is more like making the scheduler our of a 
"container" that is not managed by YARN. This is a dynamic size container. 
Before moving forward with this, we should agree that this is a proper 
abstraction and complementary to the overcommit effort.

Regarding the UI stuff, let's try to make a list of what to expose and I'll 
give it a try. My first idea is to expose the following per node/queue/cluster:
* Node utilization
* Container utilization
* External utilization
* Allocated/unallocated resources

Can you think on any other metric/aggregation we should be reporting?

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5202) Dynamic Overcommit of Node Resources - POC

2016-06-08 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321810#comment-15321810
 ] 

Inigo Goiri commented on YARN-5202:
---

As mentioned in YARN-5215, I think this works fits pretty nicely within 
YARN-1011. [~jlowe], [~nroberts], do you guys have any issues on moving this 
work there? We could use most of this patch over there. For sure, all the UI 
stuff in this patch should be added to YARN-1011.

> Dynamic Overcommit of Node Resources - POC
> --
>
> Key: YARN-5202
> URL: https://issues.apache.org/jira/browse/YARN-5202
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-5202.patch
>
>
> This Jira is to present a proof-of-concept implementation (collaboration 
> between [~jlowe] and myself) of a dynamic over-commit implementation in YARN. 
>  The type of over-commit implemented in this jira is similar to but not as 
> full-featured as what's being implemented via YARN-1011. YARN-1011 is where 
> we see ourselves heading but we needed something quick and completely 
> transparent so that we could test it at scale with our varying workloads 
> (mainly MapReduce, Spark, and Tez). Doing so has shed some light on how much 
> additional capacity we can achieve with over-commit approaches, and has 
> fleshed out some of the problems these approaches will face.
> Primary design goals:
> - Avoid changing protocols, application frameworks, or core scheduler logic,  
> - simply adjust individual nodes' available resources based on current node 
> utilization and then let scheduler do what it normally does
> - Over-commit slowly, pull back aggressively - If things are looking good and 
> there is demand, slowly add resource. If memory starts to look over-utilized, 
> aggressively reduce the amount of over-commit.
> - Make sure the nodes protect themselves - i.e. if memory utilization on a 
> node gets too high, preempt something - preferably something from a 
> preemptable queue
> A patch against trunk will be attached shortly.  Some notes on the patch:
> - This feature was originally developed against something akin to 2.7.  Since 
> the patch is mainly to explain the approach, we didn't do any sort of testing 
> against trunk except for basic build and basic unit tests
> - The key pieces of functionality are in {{SchedulerNode}}, 
> {{AbstractYarnScheduler}}, and {{NodeResourceMonitorImpl}}. The remainder of 
> the patch is mainly UI, Config, Metrics, Tests, and some minor code 
> duplication (e.g. to optimize node resource changes we treat an over-commit 
> resource change differently than an updateNodeResource change - i.e. 
> remove_node/add_node is just too expensive for the frequency of over-commit 
> changes)
> - We only over-commit memory at this point. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321809#comment-15321809
 ] 

Inigo Goiri commented on YARN-5215:
---

[~curino], in our cluster we actually surface to the Web UI the utilization. We 
also report negative values for the available resources. However, I think we 
should do a better job exposing this information similarly to what [~jlowe] has 
done in YARN-5202.

I'll start a thread in YARN-1011 about how to expose all this.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5215:
--
Attachment: YARN-5215.001.patch

Adding configuration switch, boundary checks and unit tests.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321799#comment-15321799
 ] 

Inigo Goiri commented on YARN-5215:
---

Still fixing the unit test.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch, YARN-5215.001.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321672#comment-15321672
 ] 

Inigo Goiri commented on YARN-5215:
---

Yes, I realized that the original title didn't mention external load. Fixed 
now, sorry about that; I think it's more clear. Feel free to tweak the 
description more.

As you mention, we could achieve this by tweaking the "guaranteed" size. 
However, I think that having the explicit concept regarding external 
utilization makes it simpler and it's compatible with the overcommit approach 
(both can be enabled/disabled independently). In addition, the concept of node 
utilization is not planned to be used in YARN-1011 for now.

I'm going to post during the next hour a patch with:
* Unit tests
* Conf switches
* Boundary checks

Then, I agree that we need to report this properly to the user. I was thinking 
on exposing the {{getExternalUtilization()}} or the updated 
{{getUnallocated()}} through the Web UI, etc. If we decide this feature should 
go ahead, I would add here or in a new JIRA.

To summarize the issues to discuss/finalize are:
* Decide if this should be a separate feature or within overcommit
* Add unit tests
* Add conf switches
* Add boundary checks
* Interface to expose this information

Regarding YARN-5202 vs YARN-1011, it looks to me like there's a lot of overlap 
between them. I think it'd be better to port most of YARN-5202 into YARN-1011. 
We probably should move this discussion into one of them.

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5215:
--
Description: Currently YARN runs containers in the servers assuming that 
they own all the resources. The proposal is to use the utilization information 
in the node and the containers to estimate how much is consumed by external 
processes and schedule based on this estimation.  (was: Currently YARN runs 
containers in the servers assuming that they own all the resources. The 
proposal is to use the utilization information in the node and the containers 
to estimate how much is actually available in the NMs.)

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is consumed by external processes and 
> schedule based on this estimation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5215) Scheduling containers based on external load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5215:
--
Summary: Scheduling containers based on external load in the servers  (was: 
Scheduling containers based on load in the servers)

> Scheduling containers based on external load in the servers
> ---
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is actually available in the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5215) Scheduling containers based on load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321355#comment-15321355
 ] 

Inigo Goiri commented on YARN-5215:
---

This could be a subtask in YARN-1011. However, I thought it was isolated enough 
to make it independent. I think YARN-5202 also goes into the same overcommit 
direction. To clarify the differences to YARN-1011: here we propose to estimate 
the utilization from external processes (e.g., HDFS DataNode or Impala daemons) 
and schedule based on that, so I think is orthogonal to the overcommit work. 
Happy to move it to a subtask if you guys think it makes more sense there.

Regarding the questions from [~curino]:
# This is just at scheduling time and with the proposed approach we will 
allocate the same or less as the current schedulers; so it's conservative in 
terms of scheduling and the only issue is it wouldn't use as many resources as 
it could. The estimation is: {{externalUtilization = nodeUtilization - 
containersUtilization}} and given that the container and node utilization are 
captured at different intervals, we could have containersUtilization > 
nodeUtilization; I think adding a check for negative values should be enough. 
In any case, I don't see issues with enforcing the resources as the only thing 
we do is estimating the external utilization.
# This patch only prevents scheduling, further discussion in #4.
# As I mentioned in the first paragraph, I think this is orthogonal to 
overcommit, we can run this without overcommiting resources and just prevent 
impact on the external load. If overcommitting is enabled, we can still play 
this trick.
# For the functionality described in this patch, this is it; we can open other 
tasks to do preemption at (1) scheduler level and (2) NM level. We could add 
the first one to this patch if needed.
# We have variations of this implemented and running in our cluster for the 
last year. In our scenario, we have other latency sensitive load running in 
those machines, and we want to guarantee they get as many resources as they 
need. Regarding unit testing, I can try to play with the MiniYarnCluster to 
fake external load; it shouldn't be too bad to extend 
{{TestMiniYarnClusterNodeUtilization}}.

(I tried to use bq to reply but it got messy, I hope this is comprehensive.)

Just to highlight my point on the major discussion, I think this can be a 
subtask of YARN-1011 but it's orthogonal to overcommit.

> Scheduling containers based on load in the servers
> --
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is actually available in the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5215) Scheduling containers based on load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-5215:
--
Attachment: YARN-5215.000.patch

First version to show the idea.

> Scheduling containers based on load in the servers
> --
>
> Key: YARN-5215
> URL: https://issues.apache.org/jira/browse/YARN-5215
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Inigo Goiri
> Attachments: YARN-5215.000.patch
>
>
> Currently YARN runs containers in the servers assuming that they own all the 
> resources. The proposal is to use the utilization information in the node and 
> the containers to estimate how much is actually available in the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5215) Scheduling containers based on load in the servers

2016-06-08 Thread Inigo Goiri (JIRA)
Inigo Goiri created YARN-5215:
-

 Summary: Scheduling containers based on load in the servers
 Key: YARN-5215
 URL: https://issues.apache.org/jira/browse/YARN-5215
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Inigo Goiri


Currently YARN runs containers in the servers assuming that they own all the 
resources. The proposal is to use the utilization information in the node and 
the containers to estimate how much is actually available in the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



  1   2   3   >