[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-1529:
-
Fix Version/s: 3.1.5
   3.3.1
   3.4.0
   2.10.1
   3.2.2

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-1529-branch-2.10.001.patch, YARN-1529.005.patch, 
> YARN-1529.006.patch, YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>
> Users are often unaware of the localization cost that their jobs incur. To
> measure the effectiveness of localization caches, it is necessary to expose
> the overhead in the form of metrics.
> We propose adding the following metrics to NodeManagerMetrics.
> When a container is about to launch, its set of LocalResources has to be
> fetched from a central location, typically on HDFS, which results in a number
> of download requests for the files missing from the caches.
> LocalizedFilesMissed: total files (requests) downloaded from DFS, i.e. cache
> misses.
> LocalizedFilesCached: total localization requests that were served from local
> caches, i.e. cache hits.
> LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
> LocalizedBytesCached: total bytes satisfied from local caches.
> Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that
> were served out of cache: ratio = 100 * hits / (hits + misses)
> LocalizationDownloadNanos: total elapsed time in nanoseconds for a container
> to go from ResourceRequestTransition to LocalizedTransition
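
As a rough illustration of the ratio formula in the description, here is a minimal Python sketch. The function name `cached_ratio` is hypothetical, and both the integer truncation and the zero-request behavior are assumptions rather than something the patch is known to do.

```python
def cached_ratio(hits, misses):
    """Percentage of localization requests served from cache, truncated to an int."""
    total = hits + misses
    if total == 0:
        # Assumption: report 0 when nothing has been localized yet,
        # instead of dividing by zero.
        return 0
    return 100 * hits // total


# Example: 2 files served from cache, 4 downloaded from DFS.
print(cached_ratio(2, 4))
```

The same function applies to bytes: pass the cached and missed byte counts instead of file counts.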



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-1529:
--
Attachment: YARN-1529-branch-2.10.001.patch

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529-branch-2.10.001.patch, YARN-1529.005.patch, 
> YARN-1529.006.patch, YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>






[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-1529:
--
Attachment: YARN-1529.006.patch

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.006.patch, 
> YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch, 
> YARN-1529.v04.patch
>
>






[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2016-08-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1529:
-
Attachment: YARN-1529.v04.patch

I've attached version 4 of the patch, upmerged to trunk, which is what we're 
using internally.  It's heavily derived from Gera's patch.

I agree that writing the metrics to ATS would be interesting and useful, but 
I'm not sure we should tie NM-level localization metrics and container-level 
metrics together in one JIRA.  We've found the node-level aggregated metrics 
very useful on their own.  As such I'm thinking we might want to proceed in 
this JIRA with the aggregated container localization metrics in the NM and move 
the per-container metrics in ATS to a separate JIRA.  That way we can get some 
of the benefits sooner (and on clusters that don't have ATS configured or 
prepared to handle the extra load).

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Chris Trezzo
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>






[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-03 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v03.patch

addressing javadoc warning

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
 YARN-1529.v03.patch







[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v02.patch

Moved YARN-changes from MAPREDUCE-5696

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch







[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: (was: YARN-1529.v02.patch)

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch







[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v02.patch

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch







[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2013-12-23 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v01.patch

{noformat}
$ curl -s 'http://somehost:8042/jmx?qry=Hadoop:service=NodeManager,name=NodeManagerMetrics' | python -mjson.tool
{
    "beans": [
        {
            "AllocatedContainers": 0,
            "AllocatedGB": 0,
            "AvailableGB": 8,
            "ContainersCompleted": 1,
            "ContainersFailed": 0,
            "ContainersIniting": 0,
            "ContainersKilled": 1,
            "ContainersLaunched": 2,
            "ContainersRunning": 0,
            "LocalizationDownloadNanos": 1803959000,
            "LocalizedBytesCached": 1529454,
            "LocalizedBytesCachedRatio": 49,
            "LocalizedBytesMissed": 1529546,
            "LocalizedFilesCached": 2,
            "LocalizedFilesCachedRatio": 33,
            "LocalizedFilesMissed": 4,
            "modelerType": "NodeManagerMetrics",
            "name": "Hadoop:service=NodeManager,name=NodeManagerMetrics",
            "tag.Context": "yarn",
            "tag.Hostname": "somehost"
        }
    ]
}
{noformat}
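
The sample response above can be post-processed with a few lines of Python. This is only a sketch: `summarize` is a hypothetical helper, the embedded JSON is a trimmed copy of the sample rather than a live query against a NodeManager, and recomputing the ratios with integer truncation is an assumption that happens to reproduce the reported values.

```python
import json

# Trimmed copy of the sample JMX response; a real script would fetch this
# text from the NodeManager's /jmx endpoint instead.
SAMPLE = """
{"beans": [{
    "name": "Hadoop:service=NodeManager,name=NodeManagerMetrics",
    "LocalizationDownloadNanos": 1803959000,
    "LocalizedBytesCached": 1529454,
    "LocalizedBytesMissed": 1529546,
    "LocalizedFilesCached": 2,
    "LocalizedFilesMissed": 4
}]}
"""


def percent(hits, misses):
    """Truncated cache-hit percentage; 0 when there were no requests (assumption)."""
    total = hits + misses
    return 100 * hits // total if total else 0


def summarize(jmx_text):
    """Derive human-friendly localization figures from a JMX JSON response."""
    bean = json.loads(jmx_text)["beans"][0]
    return {
        # Nanoseconds to seconds.
        "download_seconds": bean["LocalizationDownloadNanos"] / 1e9,
        "files_cached_pct": percent(bean["LocalizedFilesCached"],
                                    bean["LocalizedFilesMissed"]),
        "bytes_cached_pct": percent(bean["LocalizedBytesCached"],
                                    bean["LocalizedBytesMissed"]),
    }


summary = summarize(SAMPLE)
print(summary)
```

With the sample numbers this reports roughly 1.8 seconds of localization time, with 33% of files and 49% of bytes served from cache.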

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch







[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2013-12-23 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Issue Type: Improvement  (was: Bug)

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch




