[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263511#comment-15263511 ]

Varun Saxena commented on YARN-3959:

bq. One advantage of having it in the YARN application entity is that it enables querying (and filtering) for configs across different application types.

That's correct. Let's have it in both places then, similar to what Naga did in MAPREDUCE-6424. Thoughts?

bq. Regarding handling the size, the following may be necessary

As discussed in the meeting, this might be doable even before the 1st milestone if we reach a consensus on what to do. Should we have internal fixed limits or configurable limits? We need to decide on the limits too, say 100 KB or 200 KB. The number of configs per entity may not be a very good criterion here, IMO. If we have done something in hRaven, we can probably adopt the same approach, provided there are no conflicting opinions.

bq. If there is no generic way of doing this, I think we might want to re-word this JIRA to make it specific to mapreduce

Yes, this JIRA is intended for job configurations only. I will move it to MAPREDUCE. Moreover, we might need some thought on how to do it from the YARN framework, maybe by providing some additional interface from the RM/NM collector.

Naga and I had a discussion a couple of days ago about writing these metrics and configs as part of YARN_APPLICATION. Should we allow AMs to write anything for YARN-specific entity types? Spurious AMs could overwrite metric values, config values, or other things. If we rely on this history data for making decisions, that may be a problem. This is probably more of a concern in a public cloud scenario than in controlled environments. In any case, it goes into the realm of ACLs and who can write what and how, which we haven't given any thought to yet.

I will have a look at Li's suggestion on MAPREDUCE-6424 and move this JIRA to MAPREDUCE.
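The "fixed versus configurable limits" question raised above does not have to be either/or: a configurable limit can fall back to a fixed internal default. The following is only a sketch under assumptions. The class name, the property key `example.timeline.config.max-bytes`, and the choice of 100 KB as the default are hypothetical and are not part of YARN or the timeline service; the settings are modeled as a plain `Map` rather than a Hadoop `Configuration`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class ConfigLimitResolver {
    /** Hypothetical property name; no such key is defined by YARN. */
    static final String LIMIT_KEY = "example.timeline.config.max-bytes";

    /** Fixed internal fallback: 100 KB, one of the values floated in the thread. */
    static final long DEFAULT_MAX_BYTES = 100L * 1024;

    private ConfigLimitResolver() {}

    /** Returns the configured limit if it is a positive number, else the fixed default. */
    static long resolveMaxBytes(Map<String, String> settings) {
        String raw = settings.get(LIMIT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MAX_BYTES;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_MAX_BYTES;
        } catch (NumberFormatException e) {
            // An unparsable setting falls back to the fixed limit instead of failing.
            return DEFAULT_MAX_BYTES;
        }
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put(LIMIT_KEY, "204800"); // 200 KB
        System.out.println(resolveMaxBytes(settings));
    }
}
```

Falling back silently on a bad value is a design choice made here for brevity; rejecting the value at startup would be an equally reasonable alternative.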
> Store application related configurations in Timeline Service v2
> ---
>
> Key: YARN-3959
> URL: https://issues.apache.org/jira/browse/YARN-3959
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Junping Du
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3959-YARN-2928.01.patch
>
> We already have a configuration field in the HBase schema for the application entity.
> We need to make sure the AM writes it out when it gets launched.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263361#comment-15263361 ]

Sangjin Lee commented on YARN-3959:

[~Naganarasimha], [~varun_saxena], your thoughts?
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263337#comment-15263337 ]

Li Lu commented on YARN-3959:

bq. If there is no generic way of doing this, I think we might want to re-word this JIRA to make it specific to mapreduce. I'd be +1 on moving this JIRA to the MAPREDUCE project.

That's fine. Another concern of mine, as I raised in MAPREDUCE-6424, is that we now seem to have getters in HistoryEvents for events and metrics. I'm wondering if my suggestion in that JIRA (providing a method like addDataToEntity) would also help for configs?
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263234#comment-15263234 ]

Sangjin Lee commented on YARN-3959:

[~varun_saxena], one interesting thing about the configuration. To which entity should it be added? I see your current POC patch adds it to the MR job entity. The alternative is to add it to the YARN application entity. One advantage of having it in the YARN application entity is that it enables querying (and filtering) for configs across different application types. I don't think we specified where the config should go. Thoughts?
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263174#comment-15263174 ]

Sangjin Lee commented on YARN-3959:

{quote}
Are we posting configurations for all YARN applications, or do we just do that for MapReduce apps? Actually, if we have system level support to post configs for all YARN applications, we do not need to change much on the MR side, right? I think configs are very general for YARN apps, so maybe we can fix that on YARN's level rather than MR level?
{quote}

I am not sure if there is a YARN-generic way of writing the configuration, regardless of the frameworks. First of all, the notion of configuration is not universal, and its existence/format/data is up to the framework. For example, distributed shell does not have its own configuration, MR has a configuration ({{JobConf}}) which extends {{Configuration}}, and Spark has its own independent configuration ({{SparkConf}}) which does not derive from {{Configuration}}. Also, even if such a configuration existed, I'm not sure it is ever sent to the RM, etc., so that it could be written out to the timeline service in a single place. I'd be curious to hear your thoughts on possible mechanisms.

If there is no generic way of doing this, I think we might want to re-word this JIRA to make it specific to mapreduce. I'd be +1 on moving this JIRA to the MAPREDUCE project.

Regarding handling the size, the following may be necessary:
- split writing the configuration into multiple writes
- limit the overall size of the configuration (beyond which keys/values will be dropped?)
- limit the size of individual values (beyond which the said key/value will be dropped/truncated?)

This needs a little bit of design consideration (cc [~jrottinghuis] [~vrushalic] for the hRaven experience). As we discussed offline, IMO it is acceptable to do a simple write for now but handle the large configuration issue in a later JIRA. I'd like to hear what others think.
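The second option in the list above, limiting the overall size of the configuration, could look roughly like the following. This is a minimal sketch under assumptions, not the timeline service's behavior: the class and method names are hypothetical, size is measured as the UTF-8 byte length of keys plus values, and pairs that do not fit in the remaining budget are dropped (the thread leaves open whether dropping is the right policy).

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class ConfigByteBudget {
    private ConfigByteBudget() {}

    /**
     * Returns a copy of configs whose total key+value size stays within maxBytes.
     * Pairs are considered in iteration order; a pair that does not fit is dropped,
     * and later (smaller) pairs may still be kept.
     */
    static Map<String, String> trimToBudget(Map<String, String> configs, long maxBytes) {
        Map<String, String> kept = new LinkedHashMap<>();
        long used = 0;
        for (Map.Entry<String, String> e : configs.entrySet()) {
            long cost = sizeOf(e.getKey()) + sizeOf(e.getValue());
            if (used + cost > maxBytes) {
                continue; // over budget: drop this key/value
            }
            kept.put(e.getKey(), e.getValue());
            used += cost;
        }
        return kept;
    }

    /** UTF-8 encoded length; null counts as zero bytes. */
    private static long sizeOf(String s) {
        return s == null ? 0 : s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        configs.put("small.key", "1");
        configs.put("huge.key", "a value far too long for the budget used here");
        System.out.println(trimToBudget(configs, 32));
    }
}
```

Skipping an oversized pair and continuing keeps as many small entries as possible; stopping at the first pair that does not fit would be simpler but would make the result depend more heavily on iteration order.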
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263082#comment-15263082 ]

Li Lu commented on YARN-3959:

Hi [~varun_saxena], thanks for the prompt work! Some of my comments:
- Are we posting configurations for all YARN applications, or do we just do that for MapReduce apps? Actually, if we have system-level support to post configs for all YARN applications, we do not need to change much on the MR side, right? I think configs are very general for YARN apps, so maybe we can fix that at the YARN level rather than the MR level?
- I would argue that a "size limiter" is almost a must for this feature, since configs may get abused on some clusters (and people may not even realize that). Of course the default payload size is fine, but something could be very wrong if those configs are too big. We may want to have a two-level limit for storing configs: for each config we limit its max length, and we also limit the maximum number of configs we can attach to one timeline entity. We can make these two limits configurable at the YARN level.

If we generalize this patch to the YARN level (as the title suggests), we no longer need to rely on MAPREDUCE-6424 for the fix. Thoughts?
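The two-level limit proposed above (a maximum length per config value, plus a maximum number of configs per timeline entity) can be sketched as follows. Everything here is an assumption made for illustration: `ConfigSizeLimiter` is a hypothetical class, not an existing YARN API, and it truncates over-long values and drops surplus entries in iteration order, whereas the thread does not settle on truncate versus drop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class ConfigSizeLimiter {
    private final int maxValueLength; // limit 1: max characters per config value
    private final int maxConfigCount; // limit 2: max configs attached to one entity

    ConfigSizeLimiter(int maxValueLength, int maxConfigCount) {
        if (maxValueLength < 0 || maxConfigCount < 0) {
            throw new IllegalArgumentException("limits must be non-negative");
        }
        this.maxValueLength = maxValueLength;
        this.maxConfigCount = maxConfigCount;
    }

    /** Returns a limited copy; the input map is not modified. */
    Map<String, String> apply(Map<String, String> configs) {
        Map<String, String> limited = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : configs.entrySet()) {
            if (limited.size() >= maxConfigCount) {
                break; // count limit reached: remaining configs are dropped
            }
            String value = e.getValue();
            if (value != null && value.length() > maxValueLength) {
                value = value.substring(0, maxValueLength); // length limit: truncate
            }
            limited.put(e.getKey(), value);
        }
        return limited;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        configs.put("first.key", "a fairly long value");
        configs.put("second.key", "short");
        configs.put("third.key", "never written");
        System.out.println(new ConfigSizeLimiter(8, 2).apply(configs));
    }
}
```

Both limits are constructor arguments so that they could be wired to cluster-level settings, in line with the suggestion to make them configurable.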
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262400#comment-15262400 ]

Varun Saxena commented on YARN-3959:

Updated a patch. I am publishing all entities together as a job entity when the JOB_SUBMITTED event arrives. If we publish all entities together, the payload size would increase by around 45 KB (with default configs). That should be fine, I guess. Or do we want to break it up so that a fixed number of configs (say 200) is published in one entity?

Moreover, this patch is on top of MAPREDUCE-6424, as it touches similar areas of code. I had a brief glance at it and the approach looks fine to me, hence I created this patch on top of it.
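The alternative raised above, breaking the configuration into writes of a fixed number of configs each, amounts to simple batching. The sketch below is an illustration under assumptions: `ConfigBatcher` is a hypothetical name, the configuration is modeled as a plain `Map<String, String>`, and the actual publishing of each batch to the timeline service is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class ConfigBatcher {
    private ConfigBatcher() {}

    /** Splits configs into consecutive batches of at most batchSize entries each. */
    static List<Map<String, String>> split(Map<String, String> configs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Map<String, String>> batches = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : configs.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new LinkedHashMap<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // final, possibly smaller, batch
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        for (int i = 0; i < 450; i++) {
            configs.put("key." + i, "value." + i);
        }
        // With the batch size of 200 mentioned in the comment, each batch
        // would become one write.
        for (Map<String, String> batch : split(configs, 200)) {
            System.out.println("would publish " + batch.size() + " configs");
        }
    }
}
```

Batching by count bounds the number of entries per write but not the payload size, which is why a byte-based limit was also discussed in the thread.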
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250401#comment-15250401 ]

Li Lu commented on YARN-3959:

Hi [~varun_saxena], since our planned date for the 1st milestone is approaching, do you have free bandwidth for this issue before the deadline? I have some free bandwidth and can help with this issue if needed. Thanks.
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230253#comment-15230253 ]

Varun Saxena commented on YARN-3959:

I mean it should be doable for the 1st milestone.
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229792#comment-15229792 ]

Varun Saxena commented on YARN-3959:

[~sjlee0], IIUC this is for reporting configs from the MapReduce AM. This should ideally be a MAPREDUCE JIRA. I think this should be doable during the 1st milestone.
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229676#comment-15229676 ]

Sangjin Lee commented on YARN-3959:

I think it would be nice if we can get this in. [~varun_saxena]?
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227653#comment-15227653 ]

Naganarasimha G R commented on YARN-3959:

I think we can remove this from the *"yarn-2928-1st-milestone"* list. Thoughts?
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208436#comment-15208436 ] Junping Du commented on YARN-3959: -- Sure. [~varun_saxena], please go ahead. > Store application related configurations in Timeline Service v2 > --- > > Key: YARN-3959 > URL: https://issues.apache.org/jira/browse/YARN-3959 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Labels: yarn-2928-1st-milestone > > We already have configuration field in HBase schema for application entity. > We need to make sure AM write it out when it get launched. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207998#comment-15207998 ] Varun Saxena commented on YARN-3959: [~djp], I can work on this issue if you are not planning to work on it in the short term, as it is marked for the 1st milestone. Do let me know. > Store application related configurations in Timeline Service v2 > --- > > Key: YARN-3959 > URL: https://issues.apache.org/jira/browse/YARN-3959 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Labels: yarn-2928-1st-milestone > > We already have configuration field in HBase schema for application entity. > We need to make sure AM write it out when it get launched. -- This message was sent by Atlassian JIRA (v6.3.4#6332)