[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074661#comment-15074661 ] Varun Saxena commented on YARN-4224: Thanks [~leftnoteasy]. I suspected the same, as cluster and flow info is not returned in every response. I just wanted a clarification because this point (whether the server should send the UID) came up during the ATSv2 call. > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074267#comment-15074267 ] Wangda Tan commented on YARN-4224: Hi [~varun_saxena], I would prefer that the server return the UID to the REST client. Yes, Ember could fetch more context to build the UID, but that takes additional effort, and it might have to make additional REST calls to get the full context. I think it's better to keep as little information stored in the client as possible. Please let me know if this makes sense to you. Thanks,
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072067#comment-15072067 ] Varun Saxena commented on YARN-4224: [~leftnoteasy], I wanted a confirmation. From an Ember perspective, even if we make the delimiters and the way we construct UIDs public, would we still need to send the UID from the server (ATS reader)? Or can Ember construct the UID by itself? Probably not, because when we respond to an apps, app or entities query, the response would not have flow information (which is necessary to construct the UID). So some context information has to be maintained, I guess, which, as you said in one of your comments, is not easy to do in Ember.
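To illustrate why a client that only knows the app id cannot build a UID on its own, here is a minimal sketch. It assumes a UID is the cluster, user, flow, flow run and app id joined by a delimiter; the "!" delimiter, the field order and the names `AppContext`, `build_app_uid` and `parse_app_uid` are hypothetical and are not taken from the patch.

```python
from typing import NamedTuple

# Assumed delimiter for illustration only; the real one is whatever the patch settles on.
DELIMITER = "!"


class AppContext(NamedTuple):
    """Hypothetical context needed to locate an application."""
    cluster: str
    user: str
    flow: str
    flow_run: str
    app_id: str


def build_app_uid(ctx: AppContext) -> str:
    """Join every context field into one opaque UID string."""
    for part in ctx:
        if DELIMITER in part:
            raise ValueError(f"field {part!r} contains the delimiter")
    return DELIMITER.join(ctx)


def parse_app_uid(uid: str) -> AppContext:
    """Split a UID back into its context fields."""
    parts = uid.split(DELIMITER)
    if len(parts) != len(AppContext._fields):
        raise ValueError(f"expected {len(AppContext._fields)} fields, got {len(parts)}")
    return AppContext(*parts)
```

Under these assumptions, every field of `AppContext` is required to call `build_app_uid`, so a response that omits the flow information leaves the client unable to construct the UID unless it keeps that context itself.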
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072068#comment-15072068 ] Li Lu commented on YARN-4224: Yes, let's move forward with this for now. We can still make some quick cosmetic changes when the patch is relatively good to go. Thanks!
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072066#comment-15072066 ] Varun Saxena commented on YARN-4224: Yeah, we can go with an optional query param for entity type and not support it in the HBase implementation for now. Would it be fine, though, to have an optional param when our current implementation wants it to be mandatory, or do we keep it optional for flexibility? I think we can discuss this in our next meeting and take a final call on it. We should actually have taken a final call in the last meeting itself, but I missed that this might be a point we would need to discuss. Whether we keep entity type optional or not, the entities endpoint will have a clash, so let's go with adding -uid.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072061#comment-15072061 ] Li Lu commented on YARN-4224: I think it still depends on whether we would like to model entity type as one stage of the location of the resource, rather than on whether a parameter is mandatory or not. If we add entity type as a part of the endpoint, it means that in our system, users should locate a timeline entity through .../app/app_id/entity_type/entity_id. I'm actually fine with adding this one level, but my point is to make sure we're consistent on the location of this resource (everywhere in our REST APIs), not just to open a special case for our HBase implementation. [~leftnoteasy]'s point on returning an error would also work. It just leaves flexibility for future implementations. BTW, I think adding -uid is a good idea to distinguish the two types of endpoints.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072060#comment-15072060 ] Varun Saxena commented on YARN-4224: [~leftnoteasy], replying to one of your comments.
bq. Since we're using UIDs for these objects, I feel that adding "-uid" to the API is not necessary and could potentially confuse people.
This is necessary because the hierarchical URL clashes with the UID-based URL. If we make entity type an optional parameter, it should be optional for both hierarchical and UID-based URLs. So a URL for the hierarchical endpoint of entities would look like {{/ws/v2/timeline/apps/\{appid\}/entities}}, and for the UID-based endpoint it would be {{/ws/v2/timeline/apps/\{app UID\}/entities}}. This leads to a clash. We could probably have a single endpoint and interpret whether the value is a UID or an app id, but wouldn't that be confusing too?
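The clash described above can be sketched with a toy route matcher. This is only an illustration: real JAX-RS routing is more involved, and the exact shape of the "-uid" path used here (`apps-uid`) is an assumption based on the nouns suggested in this thread, not the final endpoint.

```python
import re


def template_to_regex(template: str) -> str:
    """Turn a path template into a regex by replacing each {placeholder}
    with a wildcard that matches exactly one path segment."""
    return "^" + re.sub(r"\{[^}]+\}", "[^/]+", template) + "$"


def routes_clash(a: str, b: str) -> bool:
    """Two templates clash when they accept exactly the same set of paths."""
    return template_to_regex(a) == template_to_regex(b)


HIERARCHICAL = "/ws/v2/timeline/apps/{appid}/entities"
UID_SAME_NOUN = "/ws/v2/timeline/apps/{app UID}/entities"
UID_OWN_NOUN = "/ws/v2/timeline/apps-uid/{app UID}/entities"  # assumed layout

print(routes_clash(HIERARCHICAL, UID_SAME_NOUN))  # True: only the placeholder name differs
print(routes_clash(HIERARCHICAL, UID_OWN_NOUN))   # False: the noun disambiguates
```

Placeholder names carry no information at match time, which is why the two templates under the same noun are indistinguishable to the server.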
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072057#comment-15072057 ] Wangda Tan commented on YARN-4224: Hi [~varun_saxena]/[~sjlee0], Apologies for my very late response; I have been traveling recently. My thoughts:
1) Semantics vs. implementation. I agree we need to consider the implementation. For now, for the flat API (what I proposed), I think we don't need to support more semantics than what Varun has proposed for the hierarchical APIs: https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15070133&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15070133. That should not bring any new issues to the backend implementation. I think it's better to make the user API extensible: since we can continuously improve the backend implementation, we won't have to break the API in the future.
2) Entity type as a query parameter. As I mentioned above, I'm fine with only supporting queries that specify entity-type; queries without entity-type will receive an error response. We can relax this limitation once our backend supports it and a real requirement arrives.
3) Supporting the flat API and the hierarchical API together. I'm fine with supporting them together if both are required. Since the web UI needs the flat API, are you OK with supporting the flat API first?
4) bq. At any rate, I agree that due to the possibility of omission ambiguities are perhaps possible. In that case, I suspect using different query nouns might be the ultimate solution (e.g. "apps" for the hierarchical and "apps-uid" for UIDs).
Since we're using UIDs for these objects, I feel that adding "-uid" to the API is not necessary and could potentially confuse people. Thoughts?
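Point 2 above (accept the entity type as a query parameter, but reject queries that omit it until the backend can serve them) could look roughly like the sketch below. The `entityType` parameter name appears in this thread; the 400 status code, the message text and the function name `check_entities_query` are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def check_entities_query(url: str):
    """Return (status, payload) for an entities query.

    The parameter is optional in the API shape, but this hypothetical
    backend refuses to serve the query when it is missing or empty.
    """
    query = parse_qs(urlsplit(url).query)
    entity_types = query.get("entityType", [])
    if not entity_types:
        return 400, "entityType is required by the current backend"
    return 200, entity_types[0]
```

Because the check lives in the handler rather than in the path, relaxing it later would not change the URL that clients already use.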
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070736#comment-15070736 ] Sangjin Lee commented on YARN-4224: +1. It is a valid point that we don't want implementations to affect our interfaces too much. That said, however, it is also the case that there is probably no strong use case where one would want all types of entities that belong to a given app. The fact that the v.1 REST API also requires the entity type is a pretty good case in point.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070658#comment-15070658 ] Varun Saxena commented on YARN-4224: BTW, even in ATSv1 the REST endpoint for fetching multiple entities looks like {{/ws/v1/timeline/\{entitytype\}}}, which means multiple entities are returned within the scope of an entity type. So there might not be a use case for this. Anyway, in v2 we can change that, knowing that queries without entity type may be slow with the HBase implementation.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070649#comment-15070649 ] Varun Saxena commented on YARN-4224:
bq. For API design, we don't want implementations to affect our interfaces too much
That is a fair point. But then our main implementation, HBase, may not be able to support it with good performance. And frankly, if we keep entity type as an optional query param, shouldn't we keep it optional for the hierarchical endpoint as well? Why only for the UID endpoint?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070634#comment-15070634 ] Varun Saxena commented on YARN-4224: In short, the limit on the number of entities to return won't have any impact on the number of rows to scan. We will have to scan all possible rows for that row prefix.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070632#comment-15070632 ] Varun Saxena commented on YARN-4224:
bq. At any rate, I agree that due to the possibility of omission ambiguities are perhaps possible. In that case, I suspect using different query nouns might be the ultimate solution (e.g. "apps" for the hierarchical and "apps-uid" for UIDs).
Although it sounds awkward, I am leaning towards it as well.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070630#comment-15070630 ] Varun Saxena commented on YARN-4224: Well, we can have a generic REST query without entity type as well. I was just saying that it would require scanning quite a bit of the entity table, and we would not want users to use it. If we make it public, users may use it, and that query may be slow. If that is acceptable, then we can go ahead with it. I doubt we will be able to make querying generic entities (i.e. querying all entities for an app irrespective of entity type) fast, even in the future, especially if users write a lot of generic entities. Thoughts, [~sjlee0]?
bq. we can even say return one random entity within the given application
Yes, but that has to be scoped within an entity type. Entity ID + entity type uniquely identify an entity; entity ID alone doesn't.
bq. "on this end point I assume you want all entities for this application, but to avoid crash myself I'm only returning a part of it" looks fine
Again, as I mentioned, the problem here is how we stop. We guarantee that entities will be ordered by created time, so we will have to scan all possible records for that row prefix to return rows up to a limit (100 or 200 or whatever). If scanning so many records is acceptable, we can have it; supporting it as such won't be a big issue. Ideally, we should scan as few records as possible (for the default use case) because it can be a performance hog.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070575#comment-15070575 ] Sangjin Lee commented on YARN-4224: Regarding the ambiguity between /ws/v2/timeline/apps/\{app UID\}/entities/\{entitytype\} (UID) and /ws/v2/timeline/apps/app_id/entities/entitytype (hierarchical), doesn't the hierarchical URL need more context such as cluster/user/flow/flow-run? Is it because all of them can be omitted? At any rate, I agree that because that context can be omitted, ambiguities are possible. In that case, I suspect using different query nouns might be the ultimate solution (e.g. "apps" for the hierarchical and "apps-uid" for UIDs).
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070306#comment-15070306 ] Li Lu commented on YARN-4224: Thanks [~leftnoteasy]! I agree that we should separate semantics from implementation. Our web UI, as one user of the REST API, does not really need general queries for timeline entities (I can always attach an entity type if needed). However, from the API design perspective, I'd hope our API is general enough. Having APIs like "list all entities within one application" may seem too ambitious for implementations, but something like "on this end point I assume you want all entities for this application, but to avoid crashing myself I'm only returning a part of it" looks fine. However, enforcing an entity type on all such queries and adding it as part of the endpoint looks a little suboptimal (it also changes the way we organize resources).
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070302#comment-15070302 ] Wangda Tan commented on YARN-4224: Hi [~varun_saxena], [~gtCarrera],
bq. Currently query without entity type is not supported
I feel that we should separate API design from internal implementation. It is quite possible that the web UI wants to make a single RPC call, pull richer application entities (i.e., all entities in one app), and render charts locally. It's fine if the current implementation doesn't support it; we can return an error response if we cannot support it now. But it is important to make the REST API extensible so that we can support this in the future without a change in semantics. Thoughts?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070295#comment-15070295 ] Li Lu commented on YARN-4224: Well, returning the first 100 entities is just one example (we could even say we return one random entity within the given application, for example). For API design, we don't want implementations to affect our interfaces too much. Entity type is not a mandatory part of an entity query, so we can keep it optional for entity queries.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070266#comment-15070266 ] Varun Saxena commented on YARN-4224: An important point. For the entity table, the row keys are not sorted by created time. So when we fetch records from HBase, a limit of 100, for instance, does not mean that we stop after fetching the first 100 records. We continue fetching records as long as the row prefix matches and keep removing the last entity based on created time to limit the entities to 100. So quite a few rows are scanned. If we do not make entity type mandatory, this would mean scanning even more rows, especially since, for the generic entity table, entity type can be anything. So I would prefer to have a check that makes entity type mandatory.
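The scan behaviour described above can be sketched with a bounded heap. This is not the reader's actual code: rows are modelled as plain (row key, created time) pairs instead of HBase results, the function name `newest_entities` is hypothetical, and it assumes "removing the last entity based on created time" means dropping the oldest one so that the newest entities are kept.

```python
import heapq
from typing import Iterable, List, Tuple

Row = Tuple[str, int]  # (row key, created time); a stand-in for an HBase result


def newest_entities(rows: Iterable[Row], limit: int = 100) -> Tuple[List[Row], int]:
    """Scan every row under a prefix and keep only the `limit` newest.

    Because row keys are not ordered by created time, the scan cannot stop
    early: the newest entity might be the very last row returned.
    Returns the kept rows (newest first) and the number of rows scanned.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    heap: List[Tuple[int, str]] = []  # min-heap; the root is the oldest kept entity
    scanned = 0
    for key, created in rows:
        scanned += 1
        if len(heap) < limit:
            heapq.heappush(heap, (created, key))
        elif created > heap[0][0]:
            # Newer than the oldest kept entity: evict that one.
            heapq.heapreplace(heap, (created, key))
    kept = sorted(heap, reverse=True)
    return [(key, created) for created, key in kept], scanned
```

The second return value makes the cost visible: with a limit of 100 and a million matching rows, `scanned` is still a million, which is the concern behind requiring an entity type to narrow the row prefix.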
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070256#comment-15070256 ] Varun Saxena commented on YARN-4224:
bq. For UID based queries we can pass type as a query parameter. For the hierarchical endpoints, type is modeled as a part of entity ids (we have to do this to uniquely id an entity).
IIUC, you mean that we can have the endpoint /ws/v2/timeline/apps/\{app UID\}/entities?entityType=... for UID, and the endpoint /ws/v2/timeline/apps/\{appid\}/entities/\{entitytype\} for the hierarchical REST URL. Let's reach an agreement on this then. Frankly, a query without entity type won't be very useful, but let's do this for differentiation. Are there any issues with adding a check that rejects the query when entityType is not supplied (other than that it is a query param)? Currently, a query without entity type is not supported. Some changes, although minor, will have to be made in the storage layer for this.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070248#comment-15070248 ] Varun Saxena commented on YARN-4224: Yes, we can have a writeup. This will be useful during eventual documentation as well.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070246#comment-15070246 ] Li Lu commented on YARN-4224: After all this discussion, I think it would be helpful to come up with a writeup of our REST API designs. We can post the writeup here so that it's much simpler to get a big picture of our reader REST APIs. I can certainly help with this.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070243#comment-15070243 ] Li Lu commented on YARN-4224: Yes, I think this is fine for entities. The root cause of this is that entities need both id and type to be uniquely identified. For UID-based queries we can pass type as a query parameter. For the hierarchical endpoints, type is modeled as a part of entity ids (we have to do this to uniquely identify an entity). The clash will happen if we hit the .../apps endpoint, and we have to distinguish those two cases.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070240#comment-15070240 ] Varun Saxena commented on YARN-4224: Please note this is specific to the entities endpoint only.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070237#comment-15070237 ] Varun Saxena commented on YARN-4224: Or we can let it clash. If the string contains the agreed delimiters, we treat it as a UID; otherwise, we treat it as an app id.
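A minimal sketch of that "let it clash" alternative follows. The "!" delimiter and the function name `classify_app_token` are assumptions for illustration; the approach only works if the chosen delimiter can never appear in a plain app id.

```python
# Assumed delimiter; the thread does not say which character was chosen.
DELIMITER = "!"


def classify_app_token(token: str) -> str:
    """Decide how a single-endpoint server would interpret the path segment
    after /apps/: as a UID if it contains the delimiter, else as an app id."""
    return "uid" if DELIMITER in token else "app_id"
```

This keeps one endpoint, at the cost of the server guessing the caller's intent from the shape of the string, which is the confusion raised elsewhere in this thread.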
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070232#comment-15070232 ] Varun Saxena commented on YARN-4224: Well, for the hierarchical endpoint we have something like {{/ws/v2/timeline/apps/\{appid\}/entities/\{entityType\}}} as the endpoint. Shouldn't they be consistent? If they are consistent, they will clash. Maybe for UID we can go with a query param for entity type, because the UID endpoint will primarily be called from the UI and the entity type will always be supplied. The default limit for the number of entities is 100.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070229#comment-15070229 ] Li Lu commented on YARN-4224: Actually I think the /ws/v2/timeline/apps/{app UID}/entities?entityType=... format looks fine. When querying entities, entity type is a query parameter but may not be mandatory. /ws/v2/timeline/apps/{app UID}/entities semantically should list all entities in one application. Implementation-wise, this may not be a good idea since there may be too many entities. There are solutions to this problem. For example, we can restrict /ws/v2/timeline/apps/{app UID}/entities to always return only the first 100 entities. With this design, if users would like to list all CONTAINER type entities, they can add entityType as a query parameter.
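To make the proposed semantics concrete, here is a minimal sketch of an entities listing where entityType is an optional filter and the result is capped at 100 by default. The in-memory list, the dictionary layout, and the function name are illustrative assumptions, not the reader's actual storage code.

```python
from typing import Dict, List, Optional

DEFAULT_LIMIT = 100  # cap mentioned in this thread; the constant name is assumed


def list_entities(entities: List[Dict[str, str]],
                  entity_type: Optional[str] = None,
                  limit: int = DEFAULT_LIMIT) -> List[Dict[str, str]]:
    """Filter by entity type when one is given, then cap the result at `limit`."""
    if entity_type is not None:
        entities = [e for e in entities if e["type"] == entity_type]
    return entities[:limit]
```

Without entityType the caller gets at most the first 100 entities of any type; with it, the cap applies after filtering.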
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070221#comment-15070221 ] Varun Saxena commented on YARN-4224: Correction - "For UID, we can put entity type as a query param, and for the hierarchical endpoint put entity type as a path param." But that's not consistent.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070218#comment-15070218 ] Varun Saxena commented on YARN-4224: Another option would be to make the entities endpoint {{/ws/v2/timeline/apps/\{app UID\}/entities?entityType=...}}. However, this will be a mandatory param (there will be a check on the server side). Please note that the hierarchical REST endpoint has been kept as {{/ws/v2/timeline/apps/\{appid\}/entities/\{entitytype\}}}. Please also note that app UID and app id are not the same thing. We need some differentiation between the UID endpoint and the hierarchical endpoint, because if we follow the general scheme the endpoints will clash. Mandatory params in REST are generally part of the path, but I guess we have no other option here. For UID, we can put entity type as a query param, and for the hierarchical endpoint as a path param. It's confusing either way. Or should we have endpoints like {{/ws/v2/timeline/runsUID/\{run UID\}/apps}}, {{/ws/v2/timeline/appsUID/\{app UID\}}}, {{/ws/v2/timeline/appsUID/\{app UID\}/entities/\{entitytype\}}}, thereby clearly indicating that a UID is being passed and avoiding the conflict mentioned above? Thoughts?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070212#comment-15070212 ] Varun Saxena commented on YARN-4224: Thanks [~leftnoteasy]. For the entities endpoint, would /ws/v2/timeline/apps/\{app UID\}/\{entitytype\} be fine for the UI? This would be a slight deviation from the other endpoints, because entity type cannot be included in the UID returned in the previous (parent) response. For querying app attempts the entity type will be YARN_APP_ATTEMPT, and for containers it will be YARN_CONTAINER, i.e. the endpoints will basically be /ws/v2/timeline/apps/\{app UID\}/YARN_APP_ATTEMPT and /ws/v2/timeline/apps/\{app UID\}/YARN_CONTAINER respectively. I don't think we will be displaying all possible generic entity types in the UI. Only app attempts and containers will be required.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070198#comment-15070198 ] Wangda Tan commented on YARN-4224: Thanks [~varun_saxena]. I synced with [~gtCarrera] about this. It's fine with me to have a two-level hierarchy ({{timeline/\{parent\}/children}}) to locate entities such as apps within a flow or flowruns within a flow. I don't have a strong opinion between the two-level hierarchy API and adding the parent id as a query parameter ({{timeline/apps?flowrun=\{flowrun_uid\}}}). The most important thing to me for the REST API is allowing the client to locate a single object at one level of the hierarchy (such as {{timeline/flowruns/\{flowrun_uid\}}}). I think we're on the same page on this.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070141#comment-15070141 ] Varun Saxena commented on YARN-4224: Sorry, for entities we cannot really have an endpoint such as /ws/v2/timeline/apps/\{app UID\}/entities/\{entitytype\}, because this will clash with the hierarchical endpoint for entities.
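The clash can be illustrated with a small sketch: once the placeholders are replaced by a generic single-segment wildcard, the UID template and the hierarchical template become the same pattern, so a path-based router has no way to tell them apart. The helper names are hypothetical, and the wildcard matching is a simplification of what a real JAX-RS router does.

```python
import re

# The two path templates under discussion in this thread.
HIERARCHICAL = "/ws/v2/timeline/apps/{appid}/entities/{entitytype}"
UID_BASED = "/ws/v2/timeline/apps/{app UID}/entities/{entitytype}"


def to_regex(template: str) -> str:
    """Replace every {placeholder} with a single-path-segment wildcard."""
    return "^" + re.sub(r"\{[^}]*\}", "[^/]+", template) + "$"


def matches(template: str, path: str) -> bool:
    """Check whether a concrete request path fits the given template."""
    return re.match(to_regex(template), path) is not None
```

Both templates compile to an identical pattern, which is why the thread goes on to look at a query parameter or a distinct path prefix for the UID variant.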
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070133#comment-15070133 ] Varun Saxena commented on YARN-4224: Based on tonight's discussion, UID endpoints can look as under:
{panel}
*Query multiple flows* : Endpoint is */ws/v2/timeline/flows or /ws/v2/timeline/\{clusterid\}/flows*. This query will return a UID of the form *cluster:user:flowname* for each flow name.
*Query multiple flowruns* : Endpoint is */ws/v2/timeline/flows/\{flow UID\}/runs* where flow UID is of the form *cluster:user:flowname* i.e. the one returned in the query above. This query returns a UID of the form *cluster:user:flowname:runid* for each flow run.
*Query single flowrun* : Endpoint is */ws/v2/timeline/runs/\{flowrun UID\}* where flowrun UID is of the form *cluster:user:flowname:runid* i.e. the one returned in the query above. This query also returns a UID of the form *cluster:user:flowname:runid* for the flowrun returned.
*Query multiple apps in a flowrun* : Endpoint can be */ws/v2/timeline/runs/\{flowrun UID\}/apps* where flowrun UID is of the form *cluster:user:flowname:runid*. This query also returns a UID of the form *cluster:user:flowname:runid:appid* for each app returned.
*Query single app* : Endpoint can be */ws/v2/timeline/apps/\{app UID\}* where app UID is of the form *cluster:user:flowname:runid:appid* i.e. the one returned in the query above.
*Query Entities* : Endpoint can be */ws/v2/timeline/apps/\{app UID\}/entities/\{entitytype\}* or */ws/v2/timeline/apps/\{app UID\}/\{entitytype\}*. Thoughts? Entity type is separate because we cannot know the entity type when we query apps. This query also returns a UID of the form *cluster:user:flowname:runid:appid:entitytype:entityid* for each entity returned.
*Query Entity* : Endpoint can be */ws/v2/timeline/entities/\{entity UID\}* where entity UID is of the form *cluster:user:flowname:runid:appid:entitytype:entityid*
{panel}
* One more question we need to discuss is whether the UID really needs to be sent from the timeline reader, or whether the client can construct it. Basically, can Ember construct it? Please note that flow context information (users, flows, etc.) will not be available in the app query or entity query response, so Ember cannot easily fetch it from the REST response. Or would it be easier for Ember if the UID came in the response? If the UID has to come in the response, we can probably elevate it to TimelineEntity as an extra field. Also, as discussed, construction of the UID can be done in Timeline Reader Manager instead of the storage layer.
cc [~sjlee0], [~gtCarrera9], [~leftnoteasy] Let's reach a consensus and conclude this before the holidays.
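As a rough illustration of the UID forms listed in this comment, the sketch below composes and parses UIDs of three to seven parts. It assumes ":" as the separator and simply rejects values that contain it; escaping rules, the field names, and both function names are assumptions made for the example, not the patch's implementation.

```python
from typing import Dict

SEP = ":"  # assumed separator, matching the cluster:user:flowname:... forms above

# Field order as listed in the thread; shorter UIDs use a prefix of this list.
FIELDS = ["cluster", "user", "flowname", "runid", "appid", "entitytype", "entityid"]


def build_uid(*parts: str) -> str:
    """Join 3 (flow) to 7 (entity) context parts into a UID string."""
    if not 3 <= len(parts) <= len(FIELDS):
        raise ValueError("a UID needs between 3 and 7 parts")
    if any(SEP in part for part in parts):
        raise ValueError("parts must not contain the separator (no escaping in this sketch)")
    return SEP.join(parts)


def parse_uid(uid: str) -> Dict[str, str]:
    """Split a UID back into named fields, in the order given by FIELDS."""
    parts = uid.split(SEP)
    if not 3 <= len(parts) <= len(FIELDS):
        raise ValueError("unexpected number of UID parts")
    return dict(zip(FIELDS, parts))
```

If the composition is made a public protocol, a client could run the same logic itself; if not, only the server would need it.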
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070006#comment-15070006 ] Sangjin Lee commented on YARN-4224: --- Sorry I am catching up with the discussion. Just to put my opinions on some of the questions raised so far. Regarding omitting some part of the path in the hierarchical form of the URL: bq. Sangjin Lee did you mean providing shortcuts to thing like applications (instead of cluster, user, flow, flowrun, app, we can directly have cluster and app)? Yes, for example, when you query for things like all apps in a flow run, it is possible to omit things like "user" as it can be inferred from the rest of the information. Although the path is /cluster/user/flow/flow-run-id/apps, I was hoping one could do /cluster/flow/flow-run-id/apps and the server will accept it as long as it can infer the missing path from the rest of the context. The UID form would have to specify all parts of the information with no exception however to eliminate any ambiguity. I hope that answers the question. Regarding creating the UID, I think we still need to make a call on whether to make the UID composition a public protocol. If we do, then potentially we don't need to return anything and don't have to worry about in which layer in the server-side it will be composed. On a related note, I'm leaning against making the UID composition configurable. I don't see a whole lot of practical need to customize UID composition, and it will only cause more confusion especially when a user/client deals with multiple clusters. On specifying the entity type along with the entity's UID, I think it would definitely better if not required. My memory is bit hazy on this, but I think there is no hard guarantee that an entity id is unique even within a parent yarn app. Entity id's are essentially up to whoever writes them, and they may choose degenerate id's. 
I think we always said only the tuple of (entity type, entity id) is unique within an application, right? So, what is the required info for uniquely locating an entity? Entity type, and entity id are needed, but how about the context? App id? Any flow contexts? > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069951#comment-15069951 ] Varun Saxena commented on YARN-4224: To aid in tonight's discussion, I will jot down the REST endpoints added and points to discuss. [~gtCarrera9], if you have suggestion on these endpoints, you can jot them down here as well. So that we can have a faster discussion during call. * REST endpoints based on UID as per current patch are as under : {panel} *Query multiple flows* : Endpoint is */ws/v2/timeline/flows or /ws/v2/timeline/\{clusterid\}/flows*. This query will return a UID of the form *cluster:user:flowname* for each flow name. *Query multiple flowruns* : Endpoint is */ws/v2/timeline/runs/\{flow UID\}* where flow UID is of the form *cluster:user:flowname* i.e. the one returned in query above. This query returns a UID of the form *cluster:user:flowname:runid* for each flow run. *Query single flowrun* : Endpoint is */ws/v2/timeline/run/\{flowrun UID\}* where flowrun UID is of the form *cluster:user:flowname:runid* i.e. the one returned in query above. This query also returns a UID of the form *cluster:user:flowname:runid* for the flowrun returned. Is this required for Web UI ? *Query multiple apps in a flowrun* : Endpoint is */ws/v2/timeline/runapps/\{flowrun UID\}* where flowrun UID is of the form *cluster:user:flowname:runid*. runapps because we are querying apps within a flowrun. Hierarchical endpoint has one endpoint to query apps within a flow name as well. This query also returns a UID of the form *cluster:user:flowname:runid:appid* for each app returned. *Query single app* : Endpoint is */ws/v2/timeline/app/\{app UID\}* where app UID is of the form *cluster:user:flowname:runid:appid* i.e. the one returned in query above. *Query Entities* : Current endpoint is */ws/v2/timeline/entities/\{entitytype\}/\{app UID\}*. Entity type is separate because we cannot know entity type when we query apps. 
This was decided to be endpoint when we had decided separator will not be public. Now as it will be public, endpoint can probably be */ws/v2/timeline/entities/\{app UID plus entity type\}* i.e. UID will be *cluster:user:flowname:runid:appid:entitytype*. But for this specific query, client needs to specifically do extra operation on UID returned in previous query, unlike other endpoints. This query also returns a UID of the form *cluster:user:flowname:runid:appid:entitytype:entityid* for each entity returned. *Query Entity* : Endpoint is */ws/v2/timeline/entity/\{entity UID\}* where entity UID is of the form *cluster:user:flowname:runid:appid:entitytype:entityid* {panel} * Need to discuss pros and cons of filling UID inside storage layer and outside it. We can add an endpoint for single flow once offline aggregation is done. > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
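The "extra operation" on the client side mentioned above, appending the entity type to the app UID returned by the previous query, might look like the sketch below. It assumes the public ":" separator and the */ws/v2/timeline/entities/\{app UID plus entity type\}* form floated in this comment; the function name is hypothetical.

```python
from urllib.parse import quote

BASE = "/ws/v2/timeline"  # path prefix used throughout this thread


def entities_path(app_uid: str, entity_type: str) -> str:
    """Build the entities query path from an app UID and an entity type."""
    uid = f"{app_uid}:{entity_type}"
    # Keep ':' readable in the path; anything else unsafe is percent-encoded.
    return f"{BASE}/entities/{quote(uid, safe=':')}"
```

For example, a UI that only needs containers would call this with the app UID it received and the type YARN_CONTAINER.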
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069190#comment-15069190 ] Varun Saxena commented on YARN-4224:
bq. It's about locating resources in the system. /flows will be a query endpoint and we can send query parameters there, but /flows/{uid} will locate one single flow. What I'm confusing right now is, why do we need to have both plural and singular forms /run/{uid} and /runs/{uid}? Will they locate to the same run, given the same UID?
Ok. The UID endpoints in the patch are set up as {{\{resource to query\}/\{uid required to query that resource\}}}. So run means a single run is queried and runs means multiple runs are queried. How do we differentiate between the different endpoints then? Is a UID not required to locate multiple flows, flowruns, apps and entities?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069172#comment-15069172 ] Varun Saxena commented on YARN-4224:
bq. So the UID in /entities/{entitytype}/{uid}/ is actually app UID? This make the whole endpoint looks really weird... I thought it's an entity UID to locate to one timeline entity. However, I think you raised a very useful use case to query a certain type of entity for one application. Maybe we'd like to change the format of this endpoint to address this case? I don't really feel like the current form of the endpoint...
When I wrote the code, I was assuming the delimiter won't be public, so the UID had to mandatorily come from the server, and we can't append the entity type there. As we have reached consensus on making the delimiter public, the UI can actually append the entity type to the UID. But this will have to be a special case for the entities endpoint.
bq. So to find one entity with cluster, user, flow, flowrun, appid and entity id, we do not have the hierarchical endpoint, but can only get an entity through the UID interface? Do we need the hierarchical interface for CLIs?
Not really. For the hierarchical endpoint, flow context information can be supplied through optional query parameters, which removes the need to query the flow context first. For hierarchical queries we also envisage direct queries, rather than only the flows->flowruns->apps->entities sequence used when querying via UID. But now that the delimiter can be public, the UID endpoint can potentially support direct queries too, so it can have a structure similar to the hierarchical one. I think we can discuss all these points in detail in today's call.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068950#comment-15068950 ] Li Lu commented on YARN-4224: - Thanks Varun. Some of my comments: bq. You mean just have one endpoint and based on delimiters in UID, decide whether to fetch single entity or multiple ? It's about locating resources in the system. /flows will be a query endpoint and we can send query parameters there, but /flows/\{uid\} will locate one single flow. What I'm confusing right now is, why do we need to have both plural and singular forms /run/\{uid\} and /runs/\{uid\}? Will they locate to the same run, given the same UID? bq. Because, the query before query for entities, in which we return UID, will be query for apps. This query is from application table where we do not have entity related information. So I cannot send all the possible entity types for an entity or a list of entities in the response. Hence when we query list of entities, it is within the scope of app UID and entity type. Hence entity type has to be specified. bq. Yes, if we try to query all entity types and related entities, this would require scanning quite a bit of the entity table which can grow quite big. And from UI, I envisage only queries for APP_ATTEMPT and CONTAINER so we would know the entity type. So the UID in /entities/\{entitytype\}/\{uid\}/ is actually app UID? This make the whole endpoint looks really weird... I thought it's an entity UID to locate to one timeline entity. However, I think you raised a very useful use case to query a certain type of entity for one application. Maybe we'd like to change the format of this endpoint to address this case? I don't really feel like the current form of the endpoint... bq. Moreover, runs endpoint will do it for you i.e. fetch all flowruns for a flow. OK this works for now. 
In future, if flows are associated with flow level aggregation data, we will need endpoints to retrieve flow level data. We can skip this step for our first milestone though. bq. There is. The \/entity\/{uid}\/ endpoint. I hope this is what your question was. So to find one entity with cluster, user, flow, flowrun, appid and entity id, we do not have the hierarchical endpoint, but can only get an entity through the UID interface? Do we need the hierarchical interface for CLIs? We can certainly discuss more of these in our meeting tomorrow. > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067939#comment-15067939 ] Varun Saxena commented on YARN-4224: There was a formatting issue in one of my comments, so I am writing it here again.
bq. 3. Seems like there is no full path to locate one entity from the cluster, user, flow, run, app, entity type, and entity id. Are we omitting this endpoint deliberately?
There is: the {{\/entity\/\{uid\}\/}} endpoint. I hope this answers your question.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067935#comment-15067935 ] Varun Saxena commented on YARN-4224: [~gtCarrera9], {quote} 1. I understand we would like to have plural forms for listing (like /flows, /apps) and singular forms for detail (like /flow/{uid}). But then, why do we need both /runs/{uid} and /run/{uid}? The same question also applies to apps. {quote} You mean just have one endpoint and based on delimiters in UID, decide whether to fetch single entity or multiple ? {quote} For entities, we require both UID and type. Why type is not a part of UID (which means UID is not sufficient to identify an entity)? {quote} Because, the query before query for entities, in which we return UID, will be query for apps. This query is from application table where we do not have entity related information. So I cannot send all the possible entity types for an entity or a list of entities in the response. Hence when we query list of entities, it is within the scope of app UID and entity type. Hence entity type has to be specified. {quote} Or, are you planning to support operations like "list all entities in a given entity type"? {quote} Yes, if we try to query all entity types and related entities, this would require scanning quite a bit of the entity table which can grow quite big. And from UI, I envisage only queries for APP_ATTEMPT and CONTAINER so we would know the entity type. {quote} If it is the latter, then do we want to consider put type into query parameters on end point entities? {quote} Normally in REST, mandatory params are kept as part of path. I expect entity type to be a mandatory param. I am assuming we do not want to support queries like get entities for all possible entity types. Thoughts ? bq. For flows, why we' re not including an UID endpoint to locate one flow? 
This poses a challenge when we'd like to list all flow runs within one flow (or, do we have any other end points to do this work? ). When we query flows, we return all possible flow runs with it. So query for a single flow is not going to give any new information. Moreover, runs endpoint will do it for you i.e. fetch all flowruns for a flow. {quote} 3. Seems like there is no full path to locate one entity from the cluster, user, flow, run, app, entity type, and entity id. Are we omitting this endpoint deliberately? {quote} There is. The {{\/entity\/{uid}\/}} endpoint. I hope this is what your question was. {quote} As a side note, in this patch there are 3 types of "shortcuts" in the URL: omit the cluster id (with default cluster id), omit user id (with default user id) and directly access app id. I'm OK with direct accessing app ids (with cluster id), but do we want to omit the other two? Comments are more then welcome. {quote} Can you elaborate a bit on this ? We have 3 endpoints. One with cluster id, one without cluster id(default cluster from config is taken) and one with UID. For apps, the UID endpoint will contain flow context information. bq. I'm also debating with myself on this. Right now I'm leaning towards to make the UIDs transparent to the storage layer. I am fine either ways. I can see pros and cons for both. Only concern I see is that if we are fetching a lot of entities and specify a high limit, we need to iterate over all the entities again to fill the UID. If it is a mere 50-100 entities then should be negligible difference but what if its very high. Another thing we need to ponder is that whether we need to support pagination for UIs' and if it would be possible to support it. Because then we will have to store some contextual information in reader. Or we can send some info back in response and continue from there for next pagination request. That is handle pagination by ourselves. Not sure if we can do this before 1st milestone though. 
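The "send some info back in the response and continue from there" idea for pagination can be sketched as a simple cursor scheme, where the reader keeps no state between requests. The sorted in-memory list, the `from_id` parameter name, and the function name are assumptions for illustration only; nothing like this exists in the patch under discussion.

```python
from typing import List, Optional, Tuple


def fetch_page(sorted_ids: List[str], limit: int,
               from_id: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
    """Return up to `limit` ids starting at `from_id`, plus the id to resume from.

    The second element is None when there is nothing left to fetch.
    """
    start = 0 if from_id is None else sorted_ids.index(from_id)
    page = sorted_ids[start:start + limit]
    next_index = start + limit
    next_from = sorted_ids[next_index] if next_index < len(sorted_ids) else None
    return page, next_from
```

The client passes the returned resume id back on its next request, so the server does not have to remember where each UI session stopped.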
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067263#comment-15067263 ] Li Lu commented on YARN-4224:

Thanks for the work [~varun_saxena]! A few quick concerns:

1. I understand we would like to have plural forms for listing (like /flows, /apps) and singular forms for detail (like /flow/\{uid\}). But then, why do we need both /runs/\{uid\} and /run/\{uid\}? The same question also applies to apps.

2. For endpoints with UIDs, we need to work with flow, flowrun, app, and entity. I notice we have such code for flowrun (run) and app. For entities, we require both UID and type. Why is type not a part of the UID (which means the UID alone is not sufficient to identify an entity)? Or are you planning to support operations like "list all entities in a given entity type"? If it is the latter, do we want to consider putting type into the query parameters on the entities endpoint? For flows, why are we not including a UID endpoint to locate one flow? This poses a challenge when we'd like to list all flow runs within one flow (or do we have any other endpoints to do this work?).

3. It seems there is no full path to locate one entity from the cluster, user, flow, run, app, entity type, and entity id. Are we omitting this endpoint deliberately?

4. As a side note, in this patch there are 3 types of "shortcuts" in the URL: omitting the cluster id (with a default cluster id), omitting the user id (with a default user id), and directly accessing the app id. I'm OK with directly accessing app ids (with cluster id), but do we want to omit the other two?

Comments are more than welcome.

bq. 3. We have 2 options. Either set UID in TimelineReaderManager or in the storage implementation. Advantage of former is that we are delinking UID implementation from backend storage implementation. Disadvantage is that we need to iterate over all the entities again to set UID. If we choose latter, it is the reverse. We can set UID while creating entities. But any new storage implementation needs to take care of filling UID then. I have as of now implemented the second option. Not yet added that UID needs to be filled in javadoc.
bq. 4. Also UID is being returned as of now in both UID endpoint queries and non UID endpoint queries. Send UID only for former ?

I'm also debating with myself on this. Right now I'm leaning towards making the UIDs transparent to the storage layer. Since UIDs will be added as an info field, they are more like an attachment to the original entities than a part of them. This also keeps writers simple (forcing writers to add some data to all written entities looks a little awkward?). Thoughts?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067160#comment-15067160 ] Hadoop QA commented on YARN-4224:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 8m 33s | feature-YARN-2928 passed |
| +1 | compile | 2m 12s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 2m 33s | feature-YARN-2928 passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 32s | feature-YARN-2928 passed |
| +1 | mvnsite | 1m 1s | feature-YARN-2928 passed |
| +1 | mvneclipse | 0m 26s | feature-YARN-2928 passed |
| +1 | findbugs | 2m 54s | feature-YARN-2928 passed |
| +1 | javadoc | 1m 23s | feature-YARN-2928 passed with JDK v1.8.0_66 |
| +1 | javadoc | 5m 46s | feature-YARN-2928 passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 2m 15s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 2m 15s | the patch passed |
| +1 | compile | 2m 27s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 2m 27s | the patch passed |
| -1 | checkstyle | 0m 30s | Patch generated 6 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 72, now 77). |
| +1 | mvnsite | 1m 0s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | findbugs | 3m 8s | the patch passed |
| +1 | javadoc | 1m 22s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 5m 43s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed with JDK v1.7.0_91. |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 22s | Patch does not generate ASF License warnings. |
| | | 46m 35s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:7c86163 |
| JIRA Patch URL | https://is
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067056#comment-15067056 ] Varun Saxena commented on YARN-4224:

Updated a WIP patch. A few points; I would like to get views on them so that we can reach agreement.

1. UID is filled in info even if fields do not indicate that INFO has to be returned. Moreover, the key is fixed as "UID". Should we make it configurable, or would documenting it be enough? Users should then not send any info key named UID.

2. The same goes for the UID delimiter. Should it be configurable?

3. We have 2 options: either set the UID in TimelineReaderManager or in the storage implementation. The advantage of the former is that we delink the UID implementation from the backend storage implementation. The disadvantage is that we need to iterate over all the entities again to set the UID. If we choose the latter, it is the reverse: we can set the UID while creating entities, but any new storage implementation then needs to take care of filling in the UID. As of now I have implemented the second option. I have not yet noted in the javadoc that the UID needs to be filled.

4. Also, UID is currently returned in both UID endpoint queries and non-UID endpoint queries. Should we send the UID only for the former?
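To make the trade-off in point 3 concrete, here is a minimal Python sketch of the first option, where the reader layer attaches the UID after the storage layer has returned its entities. It is only an illustration under stated assumptions: the dict-based entity shape, the `attach_uids` helper, and the pipe delimiter are hypothetical and are not taken from the patch.

```python
UID_KEY = "UID"  # fixed info key, as described in point 1


def attach_uids(entities, context, delimiter="|"):
    """Attach a UID to each entity in the reader layer (option 1).

    `entities` is a list of dicts as a storage backend might return them
    (hypothetical shape: "type", "id", optional "info").
    `context` is the ordered query context, e.g.
    (cluster_id, user_id, flow_name, flow_run_id, app_id).
    """
    # This extra pass over the results is the cost of option 1; in exchange
    # the storage implementation never needs to know about UIDs.
    for entity in entities:
        parts = list(context) + [entity["type"], entity["id"]]
        info = entity.setdefault("info", {})
        info[UID_KEY] = delimiter.join(str(p) for p in parts)
    return entities
```

With this layout a new storage backend only has to return plain entities, which is the decoupling argued for in the first option.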
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066917#comment-15066917 ] Li Lu commented on YARN-4224:

bq. We will pass optional parameters as part of query param. Correct?
I think that's the case. [~sjlee0] did you mean providing shortcuts to things like applications (instead of cluster, user, flow, flowrun, app, we can directly have cluster and app)?

bq. I am frankly fine with making the delimiter and how we construct UIDs public.
I'm also fine with it after giving it some thought. It looks inevitable, since we need to expose the way we form the UIDs to users anyway.

Since we're reaching agreement on most of the important factors of this issue, maybe we can kick off the work on this JIRA? Thanks!
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065296#comment-15065296 ] Varun Saxena commented on YARN-4224:

bq. permit omitting part of the path that can be omitted (need documentation on the permitted cases)
Can you elaborate on that? We will pass optional parameters as part of the query params, correct?

Also, do you think we can consider keeping a configuration for the UID delimiter, while alerting users that they need to do correct encoding if they choose a reserved character? Thoughts?

I am frankly fine with making the delimiter and how we construct UIDs public.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064931#comment-15064931 ] Sangjin Lee commented on YARN-4224:

Thanks [~gtCarrera9] for the update! Then I'd like to put forward a proposal more formally (it's not a new proposal).
- adopt Li's original proposal (2nd approach mentioned in [this comment|https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15052865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052865])
- permit omitting part of the path that can be omitted (need documentation on the permitted cases)
- also support a UID-based URL as a shorthand for the long path-based URL, but *clearly document what type of queries support UIDs*

I am still not 100% convinced that we should not make composing UIDs public (so that clients themselves can compose them, instead of a server-based end point).

This is my proposal/opinion, so obviously yours may be different. Thoughts? Comments?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061186#comment-15061186 ] Li Lu commented on YARN-4224:

Discussed the PUT use case with Wangda. Right now we're not planning any write use case for the web UI, especially when we assume all data comes from the timeline reader server. Therefore, let's focus on the GET operations and make sure those endpoints are right.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059690#comment-15059690 ] Varun Saxena commented on YARN-4224:

Please note that we can directly get app attempt and container reports from the CLI as well, so the scheme of getting the entity UID from the call for entities won't work there. For this we can either go with the hierarchical structure or have an explicit separate endpoint for attempts and containers, where we can extract the app id from the app attempt id/container id.
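As an illustration of the last point, the following Python sketch shows how an app id could be derived from an app attempt id or container id by string handling alone. It assumes the usual YARN id layouts (`appattempt_<clusterTimestamp>_<appSeq>_<attemptSeq>` and `container_[e<epoch>_]<clusterTimestamp>_<appSeq>_<attemptSeq>_<containerSeq>`); the helper name `app_id_from` is hypothetical, and a real implementation would more likely rely on the existing YARN id classes.

```python
import re

# Assumed id layouts; the optional "e<epoch>_" part appears in container ids
# issued after a ResourceManager restart.
_ATTEMPT = re.compile(r"^appattempt_(\d+)_(\d+)_\d+$")
_CONTAINER = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+$")


def app_id_from(entity_id):
    """Return the application id embedded in an attempt or container id."""
    for pattern in (_ATTEMPT, _CONTAINER):
        match = pattern.match(entity_id)
        if match:
            # Keep the zero padding of the sequence number as it was given.
            return "application_{}_{}".format(match.group(1), match.group(2))
    raise ValueError("not an app attempt or container id: " + entity_id)
```

For example, `app_id_from("container_e17_1449000000000_0001_01_000001")` yields `application_1449000000000_0001`, so an endpoint for attempts or containers would not need the flow context from the client.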
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059687#comment-15059687 ] Varun Saxena commented on YARN-4224:

As we have a call today, I think we can discuss this in detail there. I will consolidate the points for the sake of discussion.

* For the Ember UI, the hierarchical format of URL is not desirable. Refer to [comment above | https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15056762&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15056762].
* So the proposal is to treat the parameters required to make a query as a tuple (represented as a UID which has these parameters separated by some delimiter). Also, the idea is to fetch information in a hierarchical fashion, that is flows -> flowruns -> apps. So the query flow would look something like this.

{panel}
*Get flows*
The URL in this scheme will be the same as before, i.e. {{/ws/v2/timeline/flows}}. While returning these flows we will also send a list of flowruns. Please note this would be based on activity on each date. Now the proposal is that for each flow we can send a UID to aid further queries. This can be filled in the INFO field. The UID will look like {{cluster_id|user_id|flow_name}} if pipe (|) is the delimiter. We also return flow runs for each flow in the same query. So for each flowrun as well we can attach a UID, which would then be {{cluster_id|user_id|flow_name|flow_run_id}}.

*Get flowruns*
Now to get flowruns for a specific flow we can have the endpoint {{/ws/v2/timeline/flowruns/\[flow_UID\]}}, where flow_UID is what we returned in the query above, i.e. {{cluster_id|user_id|flow_name}}. Similar to above, here for each flowrun we will fill a UID as {{cluster_id|user_id|flow_name|flow_run_id}}. IIUC, for a query (multiple records) the UID may not be necessary in Ember, but let's keep it for consistency. Wangda can confirm though.

*Get single flowrun*
The endpoint here would be {{/ws/v2/timeline/flowrun/\[flowrun_UID\]}}, where flowrun_UID is what we returned in the query above, i.e. {{cluster_id|user_id|flow_name|flow_run_id}}. Similar to above, here for each flowrun we will fill a UID as {{cluster_id|user_id|flow_name|flow_run_id}}. For the Ember UI though, the call to get all flowruns may not be necessary, as getFlows may suffice to get flowruns for a flow.

*Get apps*
We can either get the list of apps under a flow or under a flowrun. Assuming the UI will use the hierarchical query, let's say we will query apps under a flowrun. So the endpoint can be {{/ws/v2/timeline/flowrunapps/\[flowrun_UID\]}}, where flowrun_UID is {{cluster_id|user_id|flow_name|flow_run_id}}. Here we need to decide if we want to fill the UID or not. We can fill the UID for each app as {{cluster_id|user_id|flow_name|flow_run_id|app_id}}, but cluster id and app id should be enough to query an app. We can have cluster id as an optional query param. But if we pass flow information for this query, a peek into the flow context table won't be required. So this needs more discussion.

*Get app*
To get an app we can either use the app_UID (containing the flow) or just use the app id with cluster as an optional query param. The endpoint can be either {{/ws/v2/timeline/app/\[app_UID\]}} or of the form {{/ws/v2/timeline/app/appid\{?clusterid=zzz\}}}. This depends on whether we keep a flat URL structure or both.

*Get entities*
Similar to get app in terms of UID requirements. The endpoint can be either {{/ws/v2/timeline/entities/\[app_UID\]/entity_type}} or of the form {{/ws/v2/timeline/app/appid/entity_type\{?clusterid=zzz\}}}. I have kept entity_type in the path for this query, as it can't be included in the UID when we query an app. And entity_type is a mandatory param, so it should ideally be in the path. As part of this query response we can construct an entity_UID of the form {{cluster_id|user_id|flow_name|flow_run_id|app_id|entity_type|entity_id}}, or {{cluster_id|app_id|entity_type|entity_id}} if we exclude flow context info.

*Get entity*
Using the UID returned above we can make the query to get a single entity. But kindly note that we can query a single app attempt or single container from the CLI as well. In this case, this scheme won't work.
{panel}

* As Sangjin said, we can support both the flat URL for the UI and the normal REST hierarchical URL for other clients. This is what I lean towards as well. If that's done, there were proposals made in [this comment|https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15052865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052865]. Although what I proposed will lead to a shorter URL, as Li pointed out the format he proposed already exists in AHS. So I will go with it as well to keep it consistent all across.
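The UID scheme described in the panel above can be sketched in a few lines of Python. This is a hedged illustration, not the patch's implementation: the pipe delimiter is only the example used in the comment, the backslash escape character and the function names `compose_uid`, `parse_uid` and `flowrun_url` are assumptions, and percent-encoding is added because `|` is not a safe character in a URL path.

```python
from urllib.parse import quote

DELIM = "|"    # example delimiter from the comment; not a settled choice
ESCAPE = "\\"  # hypothetical escape for delimiters occurring inside a part


def compose_uid(*parts):
    """Join context parts (cluster, user, flow, ...) into one UID string."""
    escaped = []
    for part in parts:
        text = str(part).replace(ESCAPE, ESCAPE + ESCAPE)
        escaped.append(text.replace(DELIM, ESCAPE + DELIM))
    return DELIM.join(escaped)


def parse_uid(uid):
    """Split a UID back into its parts, honouring escaped delimiters."""
    parts, current, i = [], [], 0
    while i < len(uid):
        ch = uid[i]
        if ch == ESCAPE and i + 1 < len(uid):
            current.append(uid[i + 1])  # take the escaped character literally
            i += 2
            continue
        if ch == DELIM:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def flowrun_url(flowrun_uid):
    """Build the single-flowrun URL with the UID percent-encoded."""
    return "/ws/v2/timeline/flowrun/" + quote(flowrun_uid, safe="")
```

A client that only receives UIDs from the server would need just `flowrun_url`; `compose_uid` and `parse_uid` matter only if the composition rule is made public, which is the open question in this thread.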
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059591#comment-15059591 ] Varun Saxena commented on YARN-4224:

Coming to [~gtCarrera9]'s points,

bq. Therefore, for applications, other than accessing through the hierarchical cluster, user, flow, flowrun order, we can directly access applications through /apps/appid. This will also help us to integrate with other YARN components. The key difference here is that, for YARN applications, their app id is actually an UID.
Yes, the app id can uniquely identify an app within a cluster. For ATSv2, though, we plan to host data from multiple clusters as well, so a query might originate from different clusters. On the reader side, we take the cluster id from config if none is supplied. Consider that we have 2 clusters but a single ATS reader. The use case for the CLI is that it falls back on ATS if the app is not found in the RM. Now if our query from the CLI does not contain a cluster id, the id of the cluster the timeline reader belongs to will be taken (from config). But ideally the behavior should be the same no matter which cluster we query from. So we can probably have cluster id as an optional query param here. The CLI can read it from config. Can the Ember UI do that, however?

bq. If there is an UID for each entity, why do we need to add entity type as one more layer of ID?
We would not require it for the entity endpoint, but we would for the entities endpoint. When we return an app, we can't include the entity type in the UID (for the entities query), as for a generic entity the entity type can be anything.

bq. If we'd like to query YARN_CONTAINER entities for a given application, maybe we'd prefer to support this in a query like: "/entities?type=YARN_CONTAINER&appid=my_app_id"?
We normally keep mandatory parameters in the path for REST. Entity type and app id are mandatory parameters for the entities query, so it is preferable to have them in the path.
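The cluster id fallback described in this comment can be sketched as follows. This is a simplified Python illustration under assumptions: the `resolve_cluster_id` helper and the dict-based arguments are hypothetical, the `clusterid` query parameter name is taken from the example URLs earlier in the thread, and `yarn.resourcemanager.cluster-id` is assumed to be the config key the reader falls back on.

```python
CLUSTER_ID_KEY = "yarn.resourcemanager.cluster-id"  # assumed fallback key


def resolve_cluster_id(query_params, reader_conf):
    """Pick the cluster id for a query on the reader side.

    An explicit `clusterid` query parameter wins; otherwise the reader
    falls back to the cluster id from its own configuration.
    """
    cluster_id = query_params.get("clusterid")
    if cluster_id:
        return cluster_id
    return reader_conf[CLUSTER_ID_KEY]
```

With one reader serving two clusters, a CLI on the second cluster that omits `clusterid` would silently get the reader's own cluster, which is why passing it as an optional query param is suggested above.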
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059562#comment-15059562 ] Varun Saxena commented on YARN-4224:

[~leftnoteasy], once we get entities, for the entity endpoint we can easily attach the entity type in the UID. For the entities endpoint, though, we cannot put the entity type in the UID when we return an app, because the entity type can be anything. For the UI, it may just be app attempt and container. But the way ATS is designed, we allow storing any generic entity with any entity type.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059420#comment-15059420 ] Sangjin Lee commented on YARN-4224:

I'm a little late to this thread, and have just started to catch up on the discussion, so I might be off-base. But let me ask some questions and also chime in on some points.

I actually thought that [Li's original proposal|https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15052865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052865] (2nd approach in that comment) was quite good. Although it can be a little verbose, it's strongly resource-based and very consistent. Also, I liked the fact that some parts of the path may be omitted (e.g. users) if they can be inferred from other information (which we already support). With that *single* rule, it can describe just about any URL here. It can't be any clearer than that.

As for the proposal for creating a single token by concatenating cluster, user, and the flow name, is the goal basically to avoid multiple levels in the URL path (to aid the UI implementation)? Then what about the flow run id? For example, let's consider querying for all apps in a flow run. With Li's original proposal, it would be
{noformat}
/ws/v2/timeline/clusters/yarn_cluster/users/admin/flows/hive_flow/runs/123/apps
{noformat}
(I swapped cluster and user)

With the concatenation proposal, it would become
{noformat}
/ws/v2/timeline/yarn_cluster_admin_hive_flow/runs/123/apps
{noformat}
(As Varun pointed out, there is the sticky issue of escaping the concatenation character, but let's set that aside for the moment)

We still have "/runs/123" appended after that UID. Is that going to be fine with the UI implementation? Or does that also need to be concatenated so that we have something like
{noformat}
/ws/v2/timeline/yarn_cluster_admin_hive_flow_123/apps
{noformat}
?

I think this has a potential of making things a lot more complicated than is needed. I don’t think it’s easy or desirable to flatten everything into a single token at all times. Also, how about cases where some part of the information can be omitted (e.g. user, flow name, flow run, etc.)? Then how should we form the UID? Would we require user/UI to always specify all the parts? It may not always be feasible.

Also, I'm not sure if I like the idea of having an end point just to return the UID given the bits. It would make the workflow a lot more complicated (the client needs this handshake before it can start querying), and I'm not sure what we gain by hiding that part in the server. If we were to do this, I think we might as well make this a public piece of information so any user or UI can compose it quickly. That would make things a whole lot easier to implement on both sides.

How about the following proposal? Can we adopt Li's original proposal and also support the UID-based pattern to aid the UI? The UID can be considered more like a short-hand notation, but would *complement (not replace)* the basic REST-style pattern. But we should clearly spell out under which conditions such concatenated UIDs are supported in order to eliminate any ambiguity (e.g. only "cluster+user+flowname", or "entity_type+entity_id" too).

It shouldn't be too difficult for the server to support both modes, and we would retain most of the simplicity that Li's first proposal has yet be able to facilitate the UI implementation. What do you think?
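To illustrate the "single rule" of the path-based form discussed in this comment, here is a small Python sketch that builds the hierarchical URL and simply skips any level that is not supplied. The helper name `hierarchical_url` and the keyword-argument interface are hypothetical; the level names and their order are taken from the example URL in the comment.

```python
from urllib.parse import quote

BASE = "/ws/v2/timeline"

# Level order as used in the example URL above (cluster before user).
LEVELS = ("clusters", "users", "flows", "runs")


def hierarchical_url(resource, **context):
    """Build a resource-based URL, omitting levels that are not given."""
    segments = [BASE]
    for level in LEVELS:
        value = context.get(level)
        if value is not None:
            segments += [level, quote(str(value), safe="")]
    segments.append(resource)
    return "/".join(segments)
```

Calling `hierarchical_url("apps", clusters="yarn_cluster", users="admin", flows="hive_flow", runs=123)` reproduces the first example URL, and dropping `users` gives the shortened form. Which omissions the server would actually accept is exactly what the comment asks to be documented.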
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059340#comment-15059340 ] Wangda Tan commented on YARN-4224:

[~varun_saxena] thanks for the reply.

bq. Now for entity endpoint we can then have UID from result of entities endpoint. So entity type is not really required here. But should we have it for sake of consistency ?
I would prefer to have a flat REST API for entities just like the others. If entity_type + entity_uid can uniquely locate an entity, why not directly add the entity_type to the entity_uid as well? Just like YARN's existing ids, where "container_" is the type of a container object and "application_" is the type of an application object. Is there any concern with including entity_type in entity_uid?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059082#comment-15059082 ] Li Lu commented on YARN-4224:

Thanks [~varun_saxena]! The plan for the query parameters looks fine to me now. We can first focus on the endpoints in this JIRA, then move to detailed query parameter design in the future. With regard to the new endpoints, some feedback:

bq. this pretty much makes us get the entities in a hierarchical fashion i.e. first get flows, then runs, then apps, and then generic entities. Because we want server to decide the delimiter instead of us fixing it.
Yes and no. For flows and flow runs, yes, because a flow's name and sequence number are not globally unique. When discussing flows and flow runs, we have to put them in a context with cluster and user information. However, for applications, the app id is globally unique (it has a cluster timestamp and an id). Therefore, for applications, other than accessing through the hierarchical cluster, user, flow, flowrun order, we can directly access applications through /apps/appid. This will also help us to integrate with other YARN components. The key difference here is that, for YARN applications, the app id is actually a UID.

bq. So we can have endpoints for entities as /entities/[entity_type]/[entities_UID] where UID is the UID we got from app response.
If there is a UID for each entity, why do we need to add entity type as one more layer of ID? If we'd like to query YARN_CONTAINER entities for a given application, maybe we'd prefer to support this in a query like: "/entities?type=YARN_CONTAINER&appid=my_app_id"?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058215#comment-15058215 ] Varun Saxena commented on YARN-4224:

Coming to the implementation, this pretty much makes us get the entities in a hierarchical fashion, i.e. first get flows, then runs, then apps, and then generic entities, because we want the server to decide the delimiter instead of us fixing it.

This is fine for the UI, but in some use cases we might have to query app information directly. For instance, from the CLI we currently fall back on AHS/ATS if the app or containers are not found in the RM. We have to do the same in ATSv2. In this case we have nothing to do with flows. And on the basis of user and app id (which we can have in YarnClient), ATSv2 can determine the flow context and complete the query. Similarly, for getting containers from the CLI, all we need is user, app id and entity type (YARN_CONTAINER). So getting the hierarchy (starting from flows) won't be required here.

We can construct the UID ourselves in the client by fixing the delimiter globally. Or pick it up from a config file. Or have the hierarchical REST endpoints as well, in addition to the UID ones, for such cases. Or have a uid endpoint. This needs to be discussed. cc [~sjlee0], [~djp]. Maybe we can discuss this in tomorrow's meeting as well.

Also, when we get flows, we can return a UID for each flowrun in the response. For each flowrun we can then query all apps, and then for each app have another UID attached. But when we query generic entities in ATSv2, we also have an extra parameter named entity type which we cannot send with each app (because a generic entity type can theoretically be anything). So we can have endpoints for entities as {{/entities/\[entity_type\]/\[entities_UID\]}}, where UID is the UID we got from the app response. In the UI we can hardcode the entity type (currently it would be containers and app attempts).

Now for the entity endpoint we can then have the UID from the result of the entities endpoint. So entity type is not really required here. But should we have it for the sake of consistency?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057912#comment-15057912 ] Varun Saxena commented on YARN-4224:

[~leftnoteasy], ok, then what I was thinking was on the right lines. Coming to your points,

bq. Not sure if it is possible to support queries like: give me flows which users satisfy a given regex, and begin/end time is from a range? Could you give me an example about what does the query look like?

We may not be able to support queries where the user satisfies a particular regex (unless we do some client side processing in the timeline reader), because the user is generally part of the row key. But yes, we can limit the results and support a host of query parameters. The optional query parameters vary per endpoint, but they are generally a subset of the list below. Typically we can trim results based on created time range, modified time range, relationships, event filters (based on the existence of some events), config and info filters (matching KV pairs) and metric filters.

{panel}
*URI Optional Query Parameters :*
*limit* - Number of entities to return.
*createdtimestart* - If specified, matched entities should not be created before this timestamp.
*createdtimeend* - If specified, matched entities should not be created after this timestamp.
*modifiedtimestart* - If specified, matched entities should not be modified before this timestamp.
*modifiedtimeend* - If specified, matched entities should not be modified after this timestamp.
*relatesto* - If specified, matched entities should relate to given entities associated with an entity type. relatesto is a comma separated list in the format \[entitytype\]:\[entityid1\]:\[entityid2\]... For e.g., relatesto=type1:entity1:entity2,type2:entity3
*isrelatedto* - If specified, matched entities should be related to given entities associated with an entity type. isrelatedto is a comma separated list in the format \[entitytype\]:\[entityid1\]:\[entityid2\]... For e.g., isrelatedto=type1:entity1:entity2,type2:entity3
*infofilters* - If specified, matched entities should have exact matches to the given info represented as key-value pairs. This is represented as infofilters=info1:value1,info2:value2...
*conffilters* - If specified, matched entities should have exact matches to the given configs represented as key-value pairs. This is represented as conffilters=conf1:value1,conf2:value2...
*metricfilters* - If specified, matched entities should contain the given metrics. This is represented as metricfilters=metricid1, metricid2...
*eventfilters* - If specified, matched entities should contain the given events. This is represented as eventfilters=eventid1, eventid2...
*fields* - Specifies which fields of the entity object to retrieve. All fields will be retrieved if fields=ALL. If not specified, 4 fields, i.e. entity type, id, created time and modified time, are returned.
{panel}

A typical query based on the old URL scheme would look like :
{noformat}
http://localhost:8188/ws/v2/timeline/entities/cluster1/app1/app?metricfilters=metric7&isrelatedto=type1:tid1_1;tid1_2,type2:tid2_1%60&relatesto=flow:flow1&eventfilters=event_2,event_4&infofilters=info2:3.5&createdtimestart=1425016502030&createdtimeend=1425016502060
{noformat}
This would change though.

After YARN-3862, we can now support fetching specific configs and metrics by specifying config and metric prefixes. This has not yet been hooked up to the REST layer. I will do so in YARN-4447. Also, once YARN-3863 goes in, we will be able to support SQL type queries containing ANDs and ORs between filters, and have queries like {{metric1 > 30 AND metric2 < 100}}. I will hook this up to the REST layer via YARN-4447 as well. In future we also plan to trim down the metrics to return (as they can be quite a lot) based on time range. This can be done by YARN-4455, although we need to discuss if modified time can be used for this use case.
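To make the parameter list above concrete, here is a minimal Python sketch of how a client might assemble such a query string. The helper name `build_entities_query` and the way filters are passed (dicts for key-value filters, lists for id filters) are assumptions made for illustration, not part of any patch on this JIRA; the base URL is the old-scheme example quoted above.

```python
from urllib.parse import urlencode


def build_entities_query(base_url, **filters):
    """Append the supplied optional query parameters to base_url.

    Hypothetical helper: dict values become key:value pairs joined by
    commas (infofilters, conffilters), list values become comma separated
    ids (metricfilters, eventfilters), everything else is passed through.
    """
    params = {}
    for name, value in filters.items():
        if value is None:
            continue  # parameter not requested
        if isinstance(value, dict):
            value = ",".join(f"{k}:{v}" for k, v in value.items())
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[name] = value
    if not params:
        return base_url
    # Keep ':' and ',' readable, since the filter syntax relies on them.
    return f"{base_url}?{urlencode(params, safe=':,')}"


url = build_entities_query(
    "http://localhost:8188/ws/v2/timeline/entities/cluster1/app1/app",
    limit=10,
    infofilters={"info2": 3.5},
    eventfilters=["event_2", "event_4"],
    createdtimestart=1425016502030,
)
print(url)
```

Values containing the `:` or `,` separators themselves would need an escaping rule that the thread has not settled, so the sketch does not attempt one.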
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056762#comment-15056762 ] Wangda Tan commented on YARN-4224:

Hi [~varun_saxena],

For your last comment:

bq. So I looked at Wangda Tan's code at YARN-3368. I see that for single record like a single app attempt, we are extending urlForFindRecord and that takes only a single string id as input instead of an object as is the case with urlForQuery. In case of app attempt and containers, we can get both appid from app attempt id, and app attempt from container so a single id would do.

That's the major reason why I asked to support a flat namespace in the REST API. Yes, you're correct, the front end JS library could support a multi-layer hierarchical REST API, but it's very painful. We have to extend the JS library to support it, and we need to keep the context of objects (in your case we need username/cluster-id/flow-id when trying to get flow related info). This is very painful in my experience of writing web UIs.

bq. Moreover, what do you mean by batch query ? Does that mean support for multiple optional query parameters like filters etc. to trim down the results ? We already have them.

Not sure if it is possible to support queries like: give me flows whose users satisfy a given regex, and whose begin/end time is within a range? Could you give me an example of what the query looks like?

In addition, I'm planning to propose adding flat namespace REST APIs to the RM side as well (and keep the existing REST APIs in the RM unchanged for compatibility). For example, we should be able to get a container with its id via {{/containers/\{container-id\}}} directly, instead of using the existing hierarchical REST API. My goal is to make RM/ATSv2 have a consistent REST API view.

Thoughts?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056626#comment-15056626 ] Li Lu commented on YARN-4224:

bq. Anyways from the GET side which is our immediate use case, if I understand, we will get a set of flows and send UID in the same response for later queries ?

Yes, how about putting them in as an "otherinfo" so that the front end can get this information?

bq. If we have all the info, cluster, user, flow, etc. can't we create a URL of the form /cluster_id/user/flow_name ?

Having hierarchical IDs is possible in Ember, but in general it's not a common practice. On this point, maybe [~leftnoteasy] has comments?

bq. Even if UID is required what should be the delimiter ? What if flow name has the same delimiter for instance. We need to handle it then.

That's something we need to consider if we'd like to pursue this approach. We may need to restrict some special characters in our cluster ids, user names and flow names.

bq. If we need this format for UI, should we have this REST endpoint in addition to our current REST endpoints(based on proposals above) for normal flow from clients ?

I'd prefer to have them as the only style of endpoints for timeline v2. Right now we need to spend some work rebuilding the AHS REST endpoints in this style for the new UI. In ATS v2 we're starting out fresh, so we don't need to handle the legacy use cases?

bq. Moreover, what do you mean by batch query ? Does that mean support for multiple optional query parameters like filters etc. to trim down the results ? We already have them.

Yes. Let's make sure they have the same style as the other endpoints (proposed in this JIRA) though. I don't think we need much work underneath the wrapper layer.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056200#comment-15056200 ] Varun Saxena commented on YARN-4224:

So I looked at [~leftnoteasy]'s code at YARN-3368. I see that for a single record like a single app attempt, we are extending urlForFindRecord, and that takes only a single string id as input instead of an object, as is the case with urlForQuery. In the case of app attempts and containers, we can get the app id from the app attempt id, and the app attempt from the container, so a single id would do. In our case no such relationship exists between cluster, user, flow, etc. Is this why we need a UID? And do we want to fetch it from the server side so that the UID encoding can be easily changed in future? Is my understanding correct?

By the way, what are the implications of calling query instead of findRecord? I guess multiple fields can be passed when we call urlForQuery.

Moreover, what do you mean by batch query? Does that mean support for multiple optional query parameters like filters etc. to trim down the results? We already have them.
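The point that "a single id would do" for containers and app attempts can be illustrated with a short Python sketch. It assumes the usual textual layout of YARN ids (`container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>`, with a six-digit attempt number in `appattempt_...` ids); the function name `ids_from_container` is hypothetical and this is not code from YARN-3368.

```python
import re

# Assumed textual layout of a container id; the optional e<epoch> part
# appears in ids issued after an RM restart with work preservation.
_CONTAINER_RE = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<container>\d+)$"
)


def ids_from_container(container_id):
    """Derive (application id, app attempt id) from a container id string."""
    match = _CONTAINER_RE.match(container_id)
    if match is None:
        raise ValueError(f"not a container id: {container_id!r}")
    app_id = f"application_{match['ts']}_{match['app']}"
    attempt_id = (
        f"appattempt_{match['ts']}_{match['app']}_{int(match['attempt']):06d}"
    )
    return app_id, attempt_id


print(ids_from_container("container_1444034548255_0001_01_000001"))
```

No comparable derivation exists for a flow name, which carries no trace of its cluster or user; that is the gap the UID is meant to fill.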
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056143#comment-15056143 ] Varun Saxena commented on YARN-4224:

IIUC, for PUT, is the proposal to get the UID first and then query again with this UID? So we will have a uid endpoint and then form the UID (as a delimited string) at the server side. In what cases would we need a PUT from the UI?

Anyways, from the GET side, which is our immediate use case, if I understand correctly, we will get a set of flows and send the UID in the same response for later queries? So I will have endpoints like {{flowrun/flowrun_UID}}, {{entities/entities_UID}}, etc.

If we have all the info, cluster, user, flow, etc., can't we create a URL of the form {{/cluster_id/user/flow_name}}? Are there any issues in Ember in creating such a URL? I am trying to understand why a flat URL structure is necessary for Ember. If I am not wrong, we create the REST URL in JSONAPIAdapter in the Ember related code. I faced some issues in linking two pages when I worked on it (not sure if this had something to do with this), but queries were being invoked with a hierarchical URL format (at least 2 levels).

Even if a UID is required, what should be the delimiter? What if the flow name contains the same delimiter, for instance? We need to handle it then.

If we need this format for the UI, should we have this REST endpoint in addition to our current REST endpoints (based on the proposals above) for the normal flow from clients?
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15055229#comment-15055229 ] Li Lu commented on YARN-4224:

[~vrushalic] I think the proposal is pretty close to the hRaven form, just trying to combine the {{\{clustername\}/\{username\}}} part together with the flow's name to form a UID, so that some front end frameworks can easily access it. A problem here is that the logic to build this UID is better kept on the server side. That's why we're proposing a "uid" endpoint that front end programs can use to build this UID when they would like to do a put (since they need to figure out the endpoint for this flow).

[~Naganarasimha] I would not be surprised if most of the implementation of the proposed API is already in our codebase. We can simply change the "wrapper" layer of our reader server's WS to make this change.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15055071#comment-15055071 ] Naganarasimha G R commented on YARN-4224:

I am catching up with the thread and went through the existing WebService interface.

Hi [~gtCarrera], maybe if you give some examples we could relate them to the existing APIs and see what modifications are being suggested. Going by your description, I already see some of these APIs in place, but examples would give more clarity.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053926#comment-15053926 ] Vrushali C commented on YARN-4224:

I am catching up on this thread. cc [~jrottinghuis] to be in the loop for the REST API URL format changes.

[~gtCarrera] I think I understand the URL when it's only an app that we want, since app ids are pretty distinct. But could you or someone give an example of, say, querying for a flow belonging to a particular user on a given cluster? For instance, I might run a pig script called "countUsers" and you also run the same pig script but under your user name. These are two different flows since, like you mentioned, we identify a flow as a {cluster, user, flow name} tuple. The run id part can default to the latest run. How would the URL look in the new proposal?

The hRaven URL for the above would look like http://hravenUrl.com/api/v1/flow/{clustername}/{username}/countUsers

This would return the last run of flow "countUsers" by user "{username}" on the cluster "{clustername}".
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053898#comment-15053898 ] Li Lu commented on YARN-4224:

Opened YARN-4445 to trace the renaming issue for flowId.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053896#comment-15053896 ] Li Lu commented on YARN-4224:

Oh, and I forgot to mention one point. For each of the proposed endpoints, not providing a UID means querying the whole collection. Apps are one example, flows are another. One problem is that if there's no query parameter on those queries, there may be too many results. So we need to enforce some parameters to limit the number of entities returned from the calls.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053868#comment-15053868 ] Li Lu commented on YARN-4224:

I've had some offline discussions with [~leftnoteasy] and [~vinodkv]. In general, the challenges come from two sides:
- For flows, flowruns, and timeline entities, their ids are a tuple instead of a single "id". For example, a flow with name {{foo}} is actually identified through a tuple (cluster, user, flow name).
- For Ember or other front end frameworks, it's not encouraged to have a hierarchical object model. That is to say, the {{/cluster_id/user_id/flow_id}} way is not encouraged for front end designs.

Therefore, maybe a solution for this problem is to "flatten" the object model for the REST APIs. That is to say, on a flow endpoint /flow/flow_id, what we really want in the flow id part is a flattened tuple, or a "UID" of the flow for the REST APIs. In this way we can avoid introducing those super long URLs. For example, if we want to get the specific data of a flow (such as the list of flow runs) with id myWorkflow, we can GET on endpoint /flow/clusterId_userId_myWorkflow (underscores inside those ids can be encoded? ). Afterwards, the entity we're getting back contains flow information, as well as a list of flow runs within this flow. The tricky part is that we need to directly return the UIDs of those flow runs, so that the front end can directly make the next REST API calls. Therefore, under this hierarchy, front end users can directly find the UIDs for all elements that are immediately below the current element's level (cluster->user->flow->flowrun).

The rest of the problem is putting new objects through this interface, where the user only knows the tuple but does not know the UID. We do not want to expose the logic that forms the "UID" to front end users. At the least, we need to keep control of this logic on our server side so that we can easily change it in the future. Therefore, we may want to provide another endpoint (like "uid") which translates parameters into a UID, so that front end users can post to the right place. In this way the UID is a pure front end concept. Anything in our backend can still use the existing context object model. We will not expose this concept to the backend storage.

So the action items for now seem to be:
1. The concrete endpoints for this model: we need clusters, users, flows, flowruns, applications, and entities in our model. Each one of them should become a separate endpoint, and each one of them takes a UID as identifier. Right now we can only support GET for the web UI, but in future we can PUT data there.
2. Batch query APIs for apps, as [~leftnoteasy] proposed.
3. I just noticed we're using the term "flowId" in our codebase, which is actually not an identifier of a flow. I'll open a JIRA to change it to flowName to avoid confusion with the UIDs of flows. Although this is not confusing us in our codebase, I suspect it may confuse our API users.

Any comments folks? [~leftnoteasy], [~vinodkv] anything I'm missing from our discussion? Thanks!
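As a rough illustration of the flattening idea, and of the delimiter question raised elsewhere in this thread, here is a small Python sketch of a server-side UID codec. The delimiter `!`, the escape character `*`, and the function names `encode_uid` / `decode_uid` are all assumptions chosen for the example; the thread has not fixed any of them.

```python
DELIM = "!"    # assumed delimiter between tuple components
ESCAPE = "*"   # assumed escape character


def encode_uid(*parts):
    """Flatten a context tuple such as (cluster, user, flow name) into one string."""
    escaped = []
    for part in parts:
        # Escape the escape character first, then the delimiter.
        part = part.replace(ESCAPE, ESCAPE + ESCAPE).replace(DELIM, ESCAPE + DELIM)
        escaped.append(part)
    return DELIM.join(escaped)


def decode_uid(uid):
    """Recover the original tuple components from a UID built by encode_uid."""
    parts, current, i = [], [], 0
    while i < len(uid):
        ch = uid[i]
        if ch == ESCAPE and i + 1 < len(uid):
            current.append(uid[i + 1])  # take the escaped character literally
            i += 2
            continue
        if ch == DELIM:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


uid = encode_uid("yarn_cluster", "admin", "hive!flow")
print(uid, decode_uid(uid))
```

Escaping is one way to tolerate a delimiter inside a flow name; the alternative mentioned in the thread is to forbid such characters in cluster ids, user names and flow names. Either way, keeping both functions on the server means clients treat the UID as opaque.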
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053671#comment-15053671 ] Varun Saxena commented on YARN-4224:

Ok... Will wait for inputs from others as well then.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053663#comment-15053663 ] Li Lu commented on YARN-4224:

[~varun_saxena] I'm not 100% sure we've reached an agreement here. Would you mind holding off a little bit? Thanks!
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053653#comment-15053653 ] Varun Saxena commented on YARN-4224:

Looking at Wangda's patch at YARN-4417 and the AHS REST endpoints, I think we can go with proposal 2. Will upload a patch tomorrow.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053640#comment-15053640 ] Varun Saxena commented on YARN-4224:

[~leftnoteasy], I guess you are going with proposal 2 above. All the path params mentioned above together uniquely identify the query. So we can simplify it further. Also we do use optional query parameters such as {{/ws/v2/timeline/apps?user=jimmy}}. I have not mentioned them above because they do not require change.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053643#comment-15053643 ] Varun Saxena commented on YARN-4224:

Sorry, I mean we can't simplify it further.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053603#comment-15053603 ] Wangda Tan commented on YARN-4224:

Thanks [~varun_saxena] for working on this patch. I took a quick look at the proposal and have two major suggestions:

*1) I'm not sure if it is possible to avoid using deep hierarchy.*
If a model has a unique id, we can simplify the query from {{/ws/v2/timeline/users/{userid}/clusters/{clusterid}/apps/{appid}/}} to {{/ws/v2/timeline/apps/{appid}}}. If a model doesn't have a unique id (assume entity-id isn't unique), we should add its must-have "parent" id to the path, for example: {{/ws/v2/timeline/apps/{appid}/entities/{entities}}}.
The reason for doing this is that a deep hierarchy needs lots of context. For example, if we're using the above REST API to query a container's status, we need to know its user/cluster-id/app-id. We shouldn't assume the client knows all of them.

*2) Batch query API*
I didn't find a batch query API example in the proposal. How about using a standard JSON-API-like format? It would look like: {{/ws/v2/timeline/apps?user=jimmy}}. You can take a look at http://jsonapi.org/examples/ for more details.

Thoughts? [~vinodkv], [~Naganarasimha].
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052865#comment-15052865 ] Varun Saxena commented on YARN-4224:

Regarding user: when I wrote the comment, we were getting the user from the caller UGI (if not provided by an optional query param), so at that time user was not part of the URL's path params. Hence it's not seen in the REST URL examples given above. User exists wherever required in the patch though, because an interim patch had dropped getting the user from UGI and moved it to a path param.

For the sake of discussion, I will write down the two approaches again.

* The first approach is the same as what I had mentioned above.
{panel}
* *Query flows*
*/ws/v2/timeline/\{clusterid}/flows*
_Eg :_ /ws/v2/timeline/yarn_cluster/flows
* *Query flowrun*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{flowid}/run/\{flowrunid}*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/hive_flow/run/123
* *Query flowruns*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{flowid}/runs*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/hive_flow/runs
* *Query app*
*/ws/v2/timeline/\{clusterid}/app/\{appid}*
_Eg :_ /ws/v2/timeline/yarn_cluster/app/application_11_1345
* *Query apps for a flow*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{flowid}/apps*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/hive_flow/apps
* *Query apps for a flowrun*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{flowid}/\{flowrunid}/apps*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/hive_flow/123/apps
* *Query entity*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{appid}/\{entitytype}/entity/\{entityid}*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/application_1444034548255_0001/YARN_CONTAINER/entity/container_1444034548255_0001_01_01
* *Query entities*
*/ws/v2/timeline/\{userid}/\{clusterid}/\{appid}/\{entitytype}/entities*
_Eg :_ /ws/v2/timeline/admin/yarn_cluster/application_1444034548255_0001/YARN_CONTAINER/entities
{panel}

* As Li said above, in AHS we have URLs of the form given below. We can adopt this approach too. These URLs would be longer though.
{panel}
* *Query flows*
*/ws/v2/timeline/clusters/\{clusterid}/flows*
_Eg :_ /ws/v2/timeline/clusters/yarn_cluster/flows
* *Query flowrun*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/flows/\{flowid}/runs/\{flowrunid}*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/flows/hive_flow/runs/123
* *Query flowruns*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/flows/\{flowid}/runs*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/flows/hive_flow/runs
* *Query app*
*/ws/v2/timeline/clusters/\{clusterid}/apps/\{appid}*
_Eg :_ /ws/v2/timeline/clusters/yarn_cluster/apps/application_11_1345
* *Query apps for a flow*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/flows/\{flowid}/apps*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/flows/hive_flow/apps
* *Query apps for a flowrun*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/flows/\{flowid}/runs/\{flowrunid}/apps*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/flows/hive_flow/runs/123/apps
* *Query entity*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/apps/\{appid}/entities/\{entitytype}/\{entityid}*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/apps/application_1444034548255_0001/entities/YARN_CONTAINER/container_1444034548255_0001_01_01
* *Query entities*
*/ws/v2/timeline/users/\{userid}/clusters/\{clusterid}/apps/\{appid}/entities/\{entitytype}*
_Eg :_ /ws/v2/timeline/users/admin/clusters/yarn_cluster/apps/application_1444034548255_0001/entities/YARN_CONTAINER
{panel}

So if we want brevity we can go with the first approach. If we want to keep it the same way as it already exists in AHS, we can go with the second approach. This would mean the URLs will be similar to what users are currently using. Everyone can give their opinion on this one.

Also, I think we should put cluster before user in the REST URL (currently user comes before cluster).
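To compare the two schemes side by side, here is a short Python sketch that builds the flowrun and entity URLs under each approach. The function names are hypothetical and exist only for this illustration; the path templates are the ones listed in the panels above, and each segment is percent-encoded on the assumption that ids may contain reserved characters.

```python
from urllib.parse import quote

BASE = "/ws/v2/timeline"


def _join(*segments):
    """Join path segments under BASE, percent-encoding each one."""
    return BASE + "/" + "/".join(quote(str(s), safe="") for s in segments)


# Approach 1: short paths, position alone tells what each segment means.
def flowrun_url_short(user, cluster, flow, run):
    return _join(user, cluster, flow, "run", run)


def entity_url_short(user, cluster, app, entity_type, entity_id):
    return _join(user, cluster, app, entity_type, "entity", entity_id)


# Approach 2: AHS-style paths, every id is preceded by its collection name.
def flowrun_url_ahs_style(user, cluster, flow, run):
    return _join("users", user, "clusters", cluster, "flows", flow, "runs", run)


def entity_url_ahs_style(user, cluster, app, entity_type, entity_id):
    return _join("users", user, "clusters", cluster, "apps", app,
                 "entities", entity_type, entity_id)


print(flowrun_url_short("admin", "yarn_cluster", "hive_flow", 123))
print(flowrun_url_ahs_style("admin", "yarn_cluster", "hive_flow", 123))
```

The second style is longer, but each segment is self-describing, which is the trade-off between brevity and AHS consistency discussed above.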
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052787#comment-15052787 ]
Varun Saxena commented on YARN-4224:

Sorry pressed add by mistake.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052785#comment-15052785 ]
Varun Saxena commented on YARN-4224:

ReWhen I had written the comment, we were getting user
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052038#comment-15052038 ]
Li Lu commented on YARN-4224:

And, BTW, under the current resource model of ATS v2, a full path to locate an entity could look like:
{code}
/clusters/{clusterid}/users/{userid}/flows/{flowid}/flowruns/{flowrunid}/apps/{appid}/entities/{entityid}
{code}
Any stage in between will return the info of a specific entity (cluster, user, flow, flowrun, app), or list the next level of resources under it.
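One way to read the "stages in between" remark is that every prefix of the full path is itself a valid endpoint. The following Python sketch enumerates those prefixes. It is an illustration under that assumption only: the function name `stages` is hypothetical, and whether the bare `/clusters` prefix would actually be served is not stated in the thread.

```python
# Hierarchy taken from the path in the comment, outermost level first.
LEVELS = ["clusters", "users", "flows", "flowruns", "apps", "entities"]


def stages(ids):
    """Return every prefix of the entity path, shortest first.

    `ids` maps a level name to its id. Levels are consumed in hierarchy
    order and the walk stops at the first level that has no id.
    """
    paths = []
    prefix = ""
    for level in LEVELS:
        prefix += "/" + level
        paths.append(prefix)      # ends on a name: lists resources at this level
        if level not in ids:
            break
        prefix += "/" + ids[level]
        paths.append(prefix)      # ends on an id: info for that one resource
    return paths


for path in stages({"clusters": "yarn_cluster", "users": "admin"}):
    print(path)
```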
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052035#comment-15052035 ]
Li Lu commented on YARN-4224:

Reformat:

OK, I've got a few references for the discussion. I looked at the WebHDFS REST APIs, but the use case there is not quite similar to ours. The RM REST APIs mostly have only one mandatory parameter, such as {{/apps/\{appid\}/appattempt}}. The AHS web services are probably the most similar use case here, so we can borrow much of their resource model. Multiple parameters are organized as an ordered sequence, each one following its parameter name, such as {{/apps/\{appid\}/appattempts/\{appattemptid\}/containers/\{containerid\}}}. Any API that does not end on a parameter (such as {{/apps/\{appid\}/appattempts}}) is treated as a list. This appears to be the typical resource model in YARN. The MapReduce AMWebService is another example of this.

Another thing: for special queries like flowapps, we can add them as shortcuts on the flow level, such as {{/cluster/\{clusterid\}/user/\{userid\}/flow/\{flowid\}/apps}}.

Could somebody please remind me why we decided to remove user from the path? Thanks!
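The convention described above (name/value pairs in order, with a path that ends on a name treated as a list) can be sketched in a few lines of Python. This is an illustration of the convention only, not how the AHS web services are implemented; the function name `classify` and the sample ids are hypothetical.

```python
def classify(path):
    """Split a path into name/id pairs and say what it addresses.

    Returns (kind, collection, ids): kind is 'list' when the path ends on
    a collection name and 'resource' when it ends on an id.
    """
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("empty path")
    names = segments[0::2]   # collection names sit at even positions
    values = segments[1::2]  # ids sit at odd positions
    kind = "list" if len(segments) % 2 == 1 else "resource"
    return kind, names[-1], dict(zip(names, values))


print(classify("/apps/app_1/appattempts"))
print(classify("/apps/app_1/appattempts/att_1/containers/c_1"))
```

Because the kind is decided purely by whether the path ends on a name or an id, the same rule covers every level of the hierarchy without special cases.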
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052030#comment-15052030 ]
Li Lu commented on YARN-4224:

OK, I've got a few references for the discussion. I looked at the WebHDFS REST APIs, but the use case there is not quite similar to ours. The RM REST APIs mostly have only one mandatory parameter, such as "/apps/{appid}/appattempt". The AHS web services are probably the most similar use case here, so we can borrow much of their resource model. Multiple parameters are organized as an ordered sequence, each one following its parameter name, such as "/apps/{appid}/appattempts/{appattemptid}/containers/{containerid}". Any API that does not end on a parameter (such as "/apps/{appid}/appattempts") is treated as a list. This appears to be the typical resource model in YARN. The MapReduce AMWebService is another example of this.

Another thing: for special queries like flowapps, we can add them as shortcuts on the flow level, such as "/cluster/{clusterid}/user/{userid}/flow/{flowid}/apps".

Could somebody please remind me why we decided to remove user from the path? Thanks!
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052022#comment-15052022 ]
Vinod Kumar Vavilapalli commented on YARN-4224:

We should model this around resources (as REST specifies) instead of around queries. Special-purpose queries can be treated as shortcuts into the existing resource hierarchy.

Also, user is missing from the above set of examples.
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051956#comment-15051956 ]
Li Lu commented on YARN-4224:

I scanned through the patch. The patch itself looks fine, but I would like to check with the broader community about the patterns proposed in this JIRA. Since this is blocking the ongoing web UI work, we really want to have a relatively stable interface before we proceed with changing the UI side.

IIUC, we're putting the mandatory parameters in hierarchical order in the URL path, and adding optional parameters as query parameters. This approach looks fine to me. For naming conventions, we really want to be consistent with the rest of the codebase.

[~wangda], any suggestions/comments here, given that this is closely related to the next-gen UI? Thanks!
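The split between mandatory path parameters and optional query parameters can be illustrated with a short Python sketch. This is a sketch under stated assumptions, not the reader's actual API: the function name `build_url` is hypothetical, and so are the query parameter names `limit` and `fields`, which are used here only as placeholders for optional filters.

```python
from urllib.parse import quote, urlencode

BASE = "/ws/v2/timeline"


def build_url(path_params, collection=None, query_params=None):
    """Build a reader URL.

    path_params:  ordered (name, value) pairs; all are mandatory.
    collection:   optional trailing collection name, for list queries.
    query_params: optional filters; entries set to None are left out.
    """
    path = BASE
    for name, value in path_params:
        path += "/" + name + "/" + quote(str(value), safe="")
    if collection:
        path += "/" + collection
    query = {k: v for k, v in (query_params or {}).items() if v is not None}
    return path + ("?" + urlencode(query) if query else "")


print(build_url([("clusters", "yarn_cluster")], collection="flows",
                query_params={"limit": 10, "fields": None}))
```

Keeping the optional values out of the path means that adding a new filter later does not change any existing URL.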
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972877#comment-14972877 ]
Hadoop QA commented on YARN-4224:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 16m 24s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 8m 2s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 31s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 14s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 13s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 43s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 3m 32s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 42m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12768555/YARN-4224-YARN-2928.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 3c4e424 |
| Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/9562/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/9562/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9562/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9562/console |

This message was automatically generated.