[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919256#comment-16919256 ]

Rohith Sharma K S commented on YARN-4821:

[~abmodi] Feel free to assign this to yourself, since there has been no activity from other community members for a long time.

> Have a separate NM timeline publishing-interval
> ---
>
> Key: YARN-4821
> URL: https://issues.apache.org/jira/browse/YARN-4821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Naganarasimha G R
> Priority: Major
> Labels: YARN-5355
> Attachments: YARN-4821-YARN-2928.v1.001.patch
>
> Currently the interval with which NM publishes container CPU and memory
> metrics is tied to {{yarn.nodemanager.resource-monitor.interval-ms}} whose
> default is 3 seconds. This is too aggressive.
> There should be a separate configuration that controls how often
> {{NMTimelinePublisher}} publishes container metrics.

--
This message was sent by Atlassian Jira (v8.3.2#803003)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730222#comment-16730222 ]

Abhishek Modi commented on YARN-4821:

[~Naganarasimha], if you are not actively working on this, can I take it over?
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888479#comment-15888479 ]

Haibo Chen commented on YARN-4821:

[~Naganarasimha], have you got cycles to continue on this jira?
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389775#comment-15389775 ]

Naganarasimha G R commented on YARN-4821:

Sure [~sjlee0], let me rebase the patch and restart the discussion.
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389721#comment-15389721 ]

Sangjin Lee commented on YARN-4821:

[~Naganarasimha], it might be time to pick this up again? Would you like to refresh the discussion and the patch? Thanks.
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267826#comment-15267826 ]

Naganarasimha G R commented on YARN-4821:

Thanks [~sjlee0]. It was mostly a WIP patch using the existing classes, just to check whether the approach is fine. I agree with most of your comments.

bq. My preference would be to use a true multiple. If we're going to emit every n-th time, we should let users define n as the config.
It was an oversight; it was intended to work as you described. I will correct it in the next patch.

bq. It would add to the garbage collection pressure.
Agreed. I thought of using the existing class, but it does not seem appropriate here. I will rework this and come up with a new class: we can have arrays for each resource type, a variable indicating the number of monitor readings already stored, and averaging logic that avoids creating unnecessary intermediate objects.

bq. What if we are still accumulating (without publishing) when the container is finished?
I had planned to publish whatever data is available when the container finishes. This was one edge case I was aware of and had thought of handling at the end :)

bq. let's instantiate this map in serviceInit() only if NM timeline publisher is enabled
Agreed.

bq. I am puzzled by this. Why are we re-defining this memory metric to be in MB? Is this necessary as part of this patch?
When using ResourceUtilization, I realized that the values were being converted into MB, accumulated at the container level, and then sent across to the RM. I felt this approach was correct and that we too should capture MB only: in most cases containers will be configured in MB (and default value also is KB), so there is little use in capturing byte-level data, since we will mostly be dealing in MBs and GBs. It would also store data unnecessarily. Thoughts?
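The accumulator described in the comment above (one array per resource type, a counter of stored readings, and an average computed without intermediate objects) could look roughly like the following. This is only a sketch under stated assumptions: the class name `ContainerUsageAccumulator`, its methods, and the `CPU`/`MEMORY` index constants are hypothetical and are not taken from the attached patch.

```java
/**
 * Hypothetical per-container accumulator (illustration only, not the
 * class from the attached patch). One instance is reused for the whole
 * lifetime of a container, so no object is allocated per reading.
 */
final class ContainerUsageAccumulator {
  // Illustrative indices into the per-resource sample arrays.
  static final int CPU = 0;
  static final int MEMORY = 1;

  private final double[][] samples; // samples[resource][readingIndex]
  private int count;                // number of readings stored so far

  ContainerUsageAccumulator(int publishEveryN) {
    if (publishEveryN < 1) {
      throw new IllegalArgumentException("publishEveryN must be >= 1");
    }
    samples = new double[2][publishEveryN];
  }

  /** Stores one monitor reading; returns true once the window is full. */
  boolean add(double cpu, double memory) {
    if (count == samples[CPU].length) {
      throw new IllegalStateException("window is full; call reset() first");
    }
    samples[CPU][count] = cpu;
    samples[MEMORY][count] = memory;
    count++;
    return count == samples[CPU].length;
  }

  /** True if at least one reading is waiting to be published. */
  boolean hasData() {
    return count > 0;
  }

  /** Average of the readings stored so far for the given resource. */
  double average(int resource) {
    if (count == 0) {
      throw new IllegalStateException("no readings stored");
    }
    double sum = 0;
    for (int i = 0; i < count; i++) {
      sum += samples[resource][i];
    }
    return sum / count;
  }

  /** Clears the window after a publish. */
  void reset() {
    count = 0;
  }
}
```

For the container-finished edge case, a caller could check `hasData()` when the container stops and publish `average(...)` over the partial window, which is the behaviour proposed in the comment above.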
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267794#comment-15267794 ]

Sangjin Lee commented on YARN-4821:

Thanks [~Naganarasimha] for the patch! I took a look at the patch, and I have some feedback and questions. I suspect there is more subtlety than it appears initially.

(1) config
Currently it is defined as "number of monitors to skip before publishing to timeline service." However, this is not too user-friendly. For example, if this value is 3, it really means we would publish every *4-th* time if I read this correctly. This may not be too apparent to users.

My preference would be to use a true multiple. If we're going to emit every {{n}}-th time, we should let users define {{n}} as the config. Can we redefine this config? Also, I would suggest the default value of 5 as discussed previously. 5 gives us a nice round number of 15 seconds (4 times a minute).

Also, can you please add it to {{yarn-default.xml}} too?

(2) keeping track of intermediate values
I am somewhat concerned about the number of objects that get instantiated (and discarded) during this process. For every live container and every monitoring interval, it would create a new {{ResourceUtilization}} object, and also another {{ResourceUtilization}} object when you average. It would add to the garbage collection pressure.

Can we come up with a different way of keeping track of intermediate values? How about creating a data class that lets you accumulate values (as opposed to having a {{List}})? That way, a single live container needs only one data class object. Also, we won't have to store things like vmem which we're not tracking anyway.

(3) edge cases
What if we are still accumulating (without publishing) when the container is finished? I think we simply don't publish? I suspect that might be OK. We might want to make it explicit with a short comment. Any other edge case we need to consider?

Onto more line-level comments:

(ContainersMonitorImpl.java)
- l. 85: It should be private. Also, let's instantiate this map in {{serviceInit()}} only if NM timeline publisher is enabled (i.e. timeline service v.2 is enabled).
- l.579: I am puzzled by this. Why are we re-defining this memory metric to be in MB? Is this necessary as part of this patch? If there is no reason to redefine the unit, we should go back to bytes...
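To make the difference in point (1) concrete, here is a small sketch contrasting the "skip" semantics, as this review reads them, with a true every-n-th multiple. The class and method names are hypothetical and the tick counter is assumed to be 1-based; none of this is taken from the patch.

```java
/**
 * Hypothetical helper (illustration only) contrasting two ways of
 * interpreting the publishing config. Ticks are 1-based monitor runs.
 */
final class PublishCadence {
  private PublishCadence() {
  }

  /** "Skip" semantics: skip `skip` readings, then publish the next one. */
  static boolean publishWithSkip(int tick, int skip) {
    return tick % (skip + 1) == 0;
  }

  /** True-multiple semantics: publish on every n-th reading. */
  static boolean publishEveryNth(int tick, int n) {
    return tick % n == 0;
  }

  /** Effective time between publishes for a true multiple n. */
  static long effectiveIntervalMs(long monitorIntervalMs, int n) {
    return monitorIntervalMs * n;
  }

  public static void main(String[] args) {
    // With skip = 3 the first publish happens on the 4th reading.
    System.out.println(publishWithSkip(3, 3)); // false
    System.out.println(publishWithSkip(4, 3)); // true
    // With n = 5 and a 3-second monitor, publishing happens every 15 s.
    System.out.println(effectiveIntervalMs(3000L, 5)); // 15000
  }
}
```

With the true-multiple form, the configured number is exactly the factor a user multiplies the monitoring interval by, which is the readability argument made in point (1).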
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230903#comment-15230903 ]

Naganarasimha G R commented on YARN-4821:

Thanks for the comments [~vinodkv],

bq. We should completely decouple these two. If the publishing-interval is configured to be not a multiple of the monitoring-interval, the publisher could only look at the last N values from the monitor before the last cycle.
As we discussed in the meeting, IMHO it is much simpler for the user to configure just the multiple of the monitoring interval after which the ATS event will be published for the resource usage. Otherwise the user needs to be made aware of the relation between the publishing interval and the monitoring interval. So it would be something like *monitoring interval = 3 seconds, publish frequency = 5*; then after 3*5 = 15 seconds, the average of 5 values will be published.

Maybe I can come up with a WIP patch based on this, and we can discuss whether it is fine. I will go through YARN-3332 before working on the patch.
[jira] [Commented] (YARN-4821) Have a separate NM timeline publishing-interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230813#comment-15230813 ]

Vinod Kumar Vavilapalli commented on YARN-4821:

bq. This proposal is simply to use a different publishing interval just for the timeline publishing
+1. We should completely decouple these two. If the publishing-interval is configured to be not a multiple of the monitoring-interval, the publisher could only look at the last N values from the monitor before the last cycle.

Can you also please read YARN-3332 and see if you can organize the code in a somewhat independent way?

A related data point for deciding the interval itself: the Hadoop Metrics plugin pulls metrics from all of our daemons and pushes them out periodically, with a default value of 10 sec IIRC. This is the periodicity for most of the production clusters. Assuming that adding container-metrics data to this still keeps the total outgoing data within the same or the next order of magnitude (say 250 metrics per NM + (50 containers * 50 metrics)), we should be okay with the same frequency. Anything more frequent will need careful benchmarking.
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229677#comment-15229677 ]

Sangjin Lee commented on YARN-4821:

I think it would be good if we can get this in, as long as it is not too complicated to implement. What do you think?
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227654#comment-15227654 ]

Naganarasimha G R commented on YARN-4821:

[~sjlee0], I think we can remove this from the *"yarn-2928-1st-milestone"* list. Thoughts?
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221831#comment-15221831 ]

Naganarasimha G R commented on YARN-4821:

Thanks [~sjlee0] for detailing your idea.

bq. More importantly, if one needs to modify the resource monitoring interval, he/she should be aware of the implication it would have on the timeline publishing, or it's easy to miss out that connection and make a mistake.
IMO the other approach/example which you mentioned also requires the user to consider the relation between the resource monitoring interval and the publishing interval, as events are not published exactly at the publishing interval. With the approach I have proposed, it would be easy for the user to understand and configure, as events will be published at a multiple of the resource monitoring interval. For instance, in your example the user just needs to configure "3". But I agree that if the user is not aware that timeline publishing is tied to the resource monitoring interval, he or she might forget to reconfigure it after changing the latter.

bq. We could also consider different intervals for CPU and memory, although one could argue that the YARN resource monitoring does not do that so we probably don't need to differentiate them. That's just my 2 cents
Yeah, I agree that to keep it simple we can keep them the same. But if the interval is, for example, 10 seconds, then for CPU usage we might not get a good picture of the actual utilization.
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214880#comment-15214880 ]

Sangjin Lee commented on YARN-4821:

I thought about what you're suggesting too. However, I'm not sure if it would be the most useful approach. First, the timeline publishing interval would be a function of the resource monitoring interval, so you'd need to look up two values to figure out how often you publish the timeline data. More importantly, if one needs to modify the resource monitoring interval, he/she should be aware of the implication it would have on the timeline publishing, or it's easy to miss out that connection and make a mistake.

The main concern with this is to control the amount of data you write, as I suspect this might be one of the more copious volumes of data we write.

How about a simple time-based publishing? Let's say the resource monitoring interval is 3 seconds, and the timeline publishing interval is 10 seconds. Then we could keep track of the last publishing time and use that to ensure we don't publish more often than every 10 seconds. The following might be one example.

|| time || resource monitoring || timeline publishing ||
| 0 | yes | yes |
| 3 | yes | |
| 6 | yes | |
| 9 | yes | |
| 12 | yes | yes |
| 15 | yes | |
| 18 | yes | |
| 21 | yes | |
| 24 | yes | yes |
| 27 | yes | |
| 30 | yes | |

This is one idea, but we could have increasingly more sophisticated ideas. For example, we could also remember the regular intervals, and write one data point even if it's less than 10 seconds from the previous write, so that we have fairly regular writes happening (in the above example, it would be at 0 seconds, 12 seconds, 21 seconds, and 30 seconds).

We could also consider different intervals for CPU and memory, although one could argue that the YARN resource monitoring does not do that so we probably don't need to differentiate them.

That's just my 2 cents.
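The two time-based policies described in the comment above can be simulated with a few lines of Java. This is a sketch only: the class `PublishThrottle` and its method names are hypothetical, times are in whole seconds, and the code models the schedule rather than any actual NM code path.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical simulation (illustration only) of two time-based
 * publishing policies. Monitor ticks occur at 0, monitorSec,
 * 2 * monitorSec, ... up to and including endSec.
 */
final class PublishThrottle {
  private PublishThrottle() {
  }

  /** Policy A: publish only if intervalSec has passed since the last publish. */
  static List<Integer> minGap(int monitorSec, int intervalSec, int endSec) {
    List<Integer> published = new ArrayList<>();
    for (int t = 0; t <= endSec; t += monitorSec) {
      boolean first = published.isEmpty();
      if (first || t - published.get(published.size() - 1) >= intervalSec) {
        published.add(t);
      }
    }
    return published;
  }

  /**
   * Policy B: deadlines fall on fixed multiples of intervalSec; publish
   * on the first monitor tick at or after each deadline.
   */
  static List<Integer> fixedDeadlines(int monitorSec, int intervalSec, int endSec) {
    List<Integer> published = new ArrayList<>();
    int nextDeadline = 0;
    for (int t = 0; t <= endSec; t += monitorSec) {
      if (t >= nextDeadline) {
        published.add(t);
        // Move to the first fixed deadline strictly after this tick.
        nextDeadline = (t / intervalSec + 1) * intervalSec;
      }
    }
    return published;
  }

  public static void main(String[] args) {
    System.out.println(minGap(3, 10, 30));         // [0, 12, 24]
    System.out.println(fixedDeadlines(3, 10, 30)); // [0, 12, 21, 30]
  }
}
```

Policy A reproduces the table above (publishes at 0, 12, and 24 seconds), while Policy B gives the more regular schedule mentioned afterwards (0, 12, 21, and 30 seconds). Both keep the publishing interval independent of the monitoring interval, at the cost of slightly uneven gaps when one is not a multiple of the other.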
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207633#comment-15207633 ]

Naganarasimha G R commented on YARN-4821:

[~sjlee0], I was thinking more along these lines: configure after how many container resource monitor runs we publish the container usage value. For example, publish the usage metrics on every 2nd monitor run, so if the monitoring interval is 3 seconds then we publish metrics every 6 seconds. This would be better, as the publishing interval cannot be independent; it depends on the container resource monitoring period.

The other thing to note here: I can understand this approach (having a longer publishing interval) for memory, but for CPU, would it make sense to collect after gaps of n intervals? With, say, 2 intervals of 3 seconds each, we would be collecting every 6 seconds, but CPU usage is usually not constant for such a long time, right? Would CPU and memory need to be handled differently?
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195899#comment-15195899 ]

Sangjin Lee commented on YARN-4821:

I don't think it needs to be that sophisticated. It can be as simple as looking at the last publishing time and deciding whether to emit the value at hand. There may well be more interesting details to this, but let's figure it out here.
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195891#comment-15195891 ]

Sangjin Lee commented on YARN-4821:

Please feel free to assign it to yourself. I'm fine with it.

This proposal is simply to use a different publishing interval just for the timeline publishing. We should not consider changing the nature of the CPU usage metric (or memory) as part of this. They are still basically gauges (instant reading of the values).

For the NM timeline publishing interval, I don't think we need to have a super-strict publishing interval ("it must be exactly N seconds between publishing"). I think it is perfectly fine if it is "publishing should not be more often than every N seconds". The main purpose is to control the volume (or speed) of writes.
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195884#comment-15195884 ]

Naganarasimha G R commented on YARN-4821:

??is it we capture instantaneous CPU usage or CPU for the period NMTimelinePublisher publishing??
=> I mean: suppose *resource-monitor.interval-ms* is 3 seconds and the ATS publishing interval for CPU metrics is 6 seconds. Do we just take whatever is reported by *ContainerMonitorInterval* after 6 seconds, or do we store the intermediate values, average them, and then push the average?

Also, how do we deal with the case where *resource-monitor.interval-ms* is 3 seconds and the ATS metrics publishing interval is 4 or 5 seconds?
[jira] [Commented] (YARN-4821) have a separate NM timeline publishing interval
[ https://issues.apache.org/jira/browse/YARN-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195874#comment-15195874 ]

Naganarasimha G R commented on YARN-4821:

Hi [~sjlee0],
One query with respect to this: how do we calculate the CPU usage, given that the duration could be different? Do we capture the instantaneous CPU usage, or the CPU usage over the NMTimelinePublisher publishing period?
Also, I would like to work on this jira.