[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369779#comment-15369779 ]

Hudson commented on YARN-3431:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-3431. Sub resources of timeline entity needs to be passed to a (sjlee: rev 2bdefbc4a070df2932a66e580d70239c132299d2)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineQueue.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/ClusterEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/HierarchicalTimelineEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineUser.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/ApplicationAttemptEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorWebService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/FlowEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/ContainerEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/UserEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/TestTimelineServiceClientIntegration.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/ApplicationEntity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/timelineservice/TestTimelineServiceRecords.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/QueueEntity.java

> Sub resources of timeline entity needs to be passed to a separate endpoint.
> ---------------------------------------------------------------------------
>
>                 Key: YARN-3431
>                 URL: https://issues.apache.org/jira/browse/YARN-3431
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>             Fix For: YARN-2928
>
>         Attachments: YARN-3431.1.patch, YARN-3431.2.patch, YARN-3431.3.patch,
> YARN-3431.4.patch, YARN-3431.5.patch, YARN-3431.6.patch, YARN-3431.7.patch
>
>
> We have TimelineEntity and some other entities as subclass that inherit from
> it. However, we only have a single endpoint, which consume TimelineEntity
> rather than sub-classes and this endpoint will check the incoming request
> body contains exactly TimelineEntity object. However, the json data which is
> serialized from sub-class object seems not to be treated as an TimelineEntity
> object, and won't be deserialized into the corresponding sub-class object
> which cause deserialization failure as some discussions in YARN-3334 :
> https://issues.apache.org/jira/browse/YARN-3334?focusedCommentId=14391059&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391059.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513930#comment-14513930 ]

Junping Du commented on YARN-3431:
----------------------------------

On second thought, this should go first, given the review effort already invested in it. However, it looks like the latest patch no longer applies cleanly now that YARN-3390 has been checked in. [~zjshen], would you update the patch? Thx!
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514431#comment-14514431 ]

Zhijie Shen commented on YARN-3431:
-----------------------------------

Junping, thanks! I've rebased the patch.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511104#comment-14511104 ]

Junping Du commented on YARN-3431:
----------------------------------

bq. I'm still not sure if it's good idea to expose two TimelineUtils to users.

I already explained why we cannot merge the two TimelineUtils classes (dependency issues). It is quite natural that methods abstracted for the timeline service in yarn-api live in a utility class in yarn-api, and likewise for the yarn-common one. What is your concern about having utility classes in different components/projects?

bq. And this jira shouldn't depend on YARN-3276, right?

Yes. But this patch could force YARN-3276, which has been pending for a long time, to be rebased again. I raised these comments long ago when reviewing YARN-3087 (https://issues.apache.org/jira/browse/YARN-3087?focusedCommentId=14339015&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14339015), but they were ignored, so I filed YARN-3276 to fix this instead. I would strongly prefer that YARN-3276 go first.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509774#comment-14509774 ]

Li Lu commented on YARN-3431:
-----------------------------

Hi [~zjshen], thanks for the update! The latest patch LGTM.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508529#comment-14508529 ]

Zhijie Shen commented on YARN-3431:
-----------------------------------

bq. It would be a little more consistent and perform slightly better if the type check in getChildren() is consolidated into validateChildren().

I refactored the code so that we don't iterate over the set twice.

bq. maybe we'd like to add some prefix to the fields we (implicitly) add to the info field of an entity?

I changed the info keys so that they start with SYSTEM_INFO_. Hopefully that reduces the chance of conflicts. In any case, we need to list the system info keys in the documentation so that users know not to use them.
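To illustrate the reserved-prefix idea from the comment above, here is a minimal sketch. Only the SYSTEM_INFO_ prefix comes from the thread; the class name InfoKeys, the PARENT_ENTITY suffix, and the choice to reject reserved keys at insertion time are assumptions made for illustration (the comment itself only proposes documenting the keys).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not taken from the YARN-3431 patch.
class InfoKeys {
  // Prefix mentioned in the thread for implicitly added info fields.
  static final String SYSTEM_INFO_PREFIX = "SYSTEM_INFO_";
  // Illustrative key name; the real key names are not given in the thread.
  static final String PARENT_KEY = SYSTEM_INFO_PREFIX + "PARENT_ENTITY";

  // True if the key falls in the namespace reserved for system fields.
  static boolean isReserved(String key) {
    return key.startsWith(SYSTEM_INFO_PREFIX);
  }

  // Adds a user-supplied info entry, rejecting keys in the reserved namespace.
  // Rejecting (rather than only documenting) is an assumption of this sketch.
  static void putUserInfo(Map<String, Object> info, String key, Object value) {
    if (isReserved(key)) {
      throw new IllegalArgumentException("Reserved info key: " + key);
    }
    info.put(key, value);
  }
}
```

With this shape, a user-defined key such as "color" is stored normally, while any key beginning with SYSTEM_INFO_ is refused before it can collide with a system field.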
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510143#comment-14510143 ]

Junping Du commented on YARN-3431:
----------------------------------

Thanks [~zjshen] for updating the patch! Sorry that my comments come a little late. The patch looks good overall. However, I still cannot get used to the cast on a null object and the duplicated code for casting Map to HashMap. Can we let YARN-3276 go first? I just checked, and its v2 patch is still valid.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505334#comment-14505334 ]

Sangjin Lee commented on YARN-3431:
-----------------------------------

It looks good to me. One small suggestion (it's not critical but would be nicer): It would be a little more consistent and perform slightly better if the type check in getChildren() is consolidated into validateChildren(). In validateChildren() we iterate over the set anyway, and we could do the type check as part of validating it. What do you think?
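The suggestion above, doing the element type check inside the same loop that validates the children, could look roughly like the sketch below. The method name validateChildren comes from the thread; its signature, the nested Identifier stand-in, and the allowedTypes parameter are assumptions, since the patch itself is not quoted here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of a single-pass validation; not the patch's actual code.
class ChildValidation {
  // Minimal stand-in for the entity identifier (type plus id).
  static final class Identifier {
    final String type;
    final String id;

    Identifier(String type, String id) {
      this.type = type;
      this.id = id;
    }
  }

  // Walks the raw set once: each element is type-checked and validated in the
  // same iteration, so callers do not need a second pass just to cast.
  static Set<Identifier> validateChildren(Set<?> raw, Set<String> allowedTypes) {
    Set<Identifier> children = new LinkedHashSet<>();
    for (Object element : raw) {
      if (!(element instanceof Identifier)) {
        throw new IllegalArgumentException("Child is not an Identifier: " + element);
      }
      Identifier child = (Identifier) element;
      if (!allowedTypes.contains(child.type)) {
        throw new IllegalArgumentException("Incompatible child type: " + child.type);
      }
      children.add(child);
    }
    return children;
  }
}
```

A getChildren() built on top of this would simply return the already-typed set, which is the consistency gain the comment is after.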
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506003#comment-14506003 ]

Li Lu commented on YARN-3431:
-----------------------------

Hi [~zjshen], the latest patch LGTM. One quick question: maybe we'd like to add some prefix to the fields we (implicitly) add to the info field of an entity? In this way we can further reduce the chance for user defined info fields to conflict with our implicitly added fields.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504394#comment-14504394 ]

Zhijie Shen commented on YARN-3431:
-----------------------------------

[~sjlee0], thanks for your comments. I addressed them in the new patch. For the last one, I changed the code to store the list of identifiers directly as the info value. As the identifier class is already annotated, we don't need to take care of ser/des ourselves: Jackson will marshal it into, and unmarshal it from, a JSON object. I've verified it locally. I also added some more tests to verify the parent/children APIs.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503425#comment-14503425 ]

Sangjin Lee commented on YARN-3431:
-----------------------------------

Thanks for the update [~zjshen]! I quickly went over it, and have some comments.

(FlowEntity.java)
- l.58-60: setId() is done twice?
- l.94: why create a new Long()?

(HierarchicalTimelineEntity.java)
- l.24: I think this was pointed out previously, but I understand the Hadoop coding convention is not to use wildcard imports
- l.43: wondering out loud, is there an issue in storing the Identifier directly as opposed to a concatenated string? At least, I think we could provide Identifier.toString() and Identifier.fromString() (static) to handle this more tersely
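As a rough illustration of the toString()/fromString() pair suggested for l.43, the sketch below round-trips a type and id through one string. The class is named EntityIdentifier to make clear it is a stand-in, and the '!' separator is an assumption; the thread does not say how the patch concatenates the two parts.

```java
// Hypothetical stand-in for the Identifier class discussed above.
final class EntityIdentifier {
  // Assumed separator; it must not occur in the type for fromString to work.
  private static final char SEPARATOR = '!';

  private final String type;
  private final String id;

  EntityIdentifier(String type, String id) {
    this.type = type;
    this.id = id;
  }

  String getType() {
    return type;
  }

  String getId() {
    return id;
  }

  // Serializes as type + separator + id.
  @Override
  public String toString() {
    return type + SEPARATOR + id;
  }

  // Parses the form produced by toString(), splitting at the first separator.
  static EntityIdentifier fromString(String text) {
    int index = text.indexOf(SEPARATOR);
    if (index < 0) {
      throw new IllegalArgumentException("Not a serialized identifier: " + text);
    }
    return new EntityIdentifier(text.substring(0, index), text.substring(index + 1));
  }
}
```

Splitting at the first separator keeps ids that happen to contain the separator intact, at the cost of forbidding it in type names; storing the identifier object directly, as the comment also proposes, avoids that constraint altogether.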
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499314#comment-14499314 ]

Zhijie Shen commented on YARN-3431:
-----------------------------------

Right, TimelineEntity is the generic Java form for us to compose a timeline entity in Java code, while its corresponding JSON object is the payload during REST communication. Subclasses of TimelineEntity are defined to make it easy for us and for users to manipulate some predefined, specific attributes.

bq. My main problem is with the prototype field of TimelineEntity.

Maybe I should rename prototype to real. After the endpoint of the web server receives the entity, no matter whether it was the generic TimelineEntity or a subclass object, it is deserialized as a TimelineEntity object. If it was a subclass object, the content is preserved, but the Java class hierarchy is lost after deserialization. However, we can use the TimelineEntity and its type to construct the right subclass object in a *proxy* way.

bq. For HierarchicalTimelineEntity, seems like we're not adding any special tags when we addIsRelatedToEntity() in setParent()

Yeah, relates to/is related to is used to construct a directed graph among entities. A parent-child relationship is a tree, which can be described by relates to/is related to.

bq. Are we prohibiting the users from using isRelatedToEntities in HierarchicalTimelineEntity completely to avoid problems?

Sounds good. I thought about it before, but did not include it in this patch.

bq. , I'm not sure if we really need the subclass information.

I'm not entirely sure either, but I guess we probably won't need the subclasses' Java APIs, which is why I put a comment there. However, since the overhead is small given the way we construct the subclass object, I prefer to leave the code there in case we want the subclass APIs somewhere (e.g., aggregation).

There are two additional bugs in this patch. I'll fix the outstanding issues and upload a new one later.
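The "proxy" construction described in the comment above can be sketched as follows: the generic object produced by deserialization is wrapped, not copied, and a typed view is chosen based on its type field. This is only an illustration under assumptions. GenericEntity stands in for TimelineEntity, AppEntityView for a subclass such as ApplicationEntity, and the "QUEUE" info key is invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TimelineEntity; names are illustrative only.
class GenericEntity {
  // The wrapped entity, or null when this object holds its own state.
  private final GenericEntity real;
  private String type;
  private final Map<String, Object> info = new HashMap<>();

  GenericEntity() {
    this.real = null;
  }

  // Wraps an already-deserialized entity instead of deep-copying it.
  GenericEntity(GenericEntity entity) {
    this.real = entity.getReal();
  }

  // Returns the object that actually owns the state.
  protected GenericEntity getReal() {
    return real == null ? this : real;
  }

  String getType() {
    return real == null ? type : real.getType();
  }

  void setType(String newType) {
    if (real == null) {
      type = newType;
    } else {
      real.setType(newType);
    }
  }

  Map<String, Object> getInfo() {
    return real == null ? info : real.getInfo();
  }
}

// Typed view over a generic entity whose type field marks it as an application.
class AppEntityView extends GenericEntity {
  static final String TYPE = "YARN_APPLICATION";

  AppEntityView(GenericEntity entity) {
    super(entity);
    if (!TYPE.equals(entity.getType())) {
      throw new IllegalArgumentException("Incompatible entity type: " + entity.getType());
    }
  }

  // Convenience accessor backed by the shared info map ("QUEUE" is made up).
  String getQueue() {
    return (String) getInfo().get("QUEUE");
  }
}
```

Because every accessor delegates to the wrapped object, the view and the original share state, which is why this approach avoids a deep copy when an entity arrives at the endpoint.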
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499413#comment-14499413 ]

Li Lu commented on YARN-3431:
-----------------------------

bq. After receiving the entity from the endpoint of the web server, not matter it was the generic TimelineEntity or the subclass object, it will be deserialized as TimelineEntity object. If it was the subclass object, the content is preserved, but the Java class hierarchy is lost after deserialization. However, we can use TimelineEntity and its type to construct the right subclass object in a proxy way.

OK, I agree the current design would save us one deep copy every time we receive a timeline entity. I'm still thinking about an appropriate name for the prototype field to better represent its nature...

bq. For HierarchicalTimelineEntity, seems like we're not adding any special tags when we addIsRelatedToEntity() in setParent()
bq. Yeah, relates to/ is related to is used to construct a directed graph among entities. Parent-child relationship is a tree, which can be described by relates to/ is related.
bq. Are we prohibiting the users from using isRelatedToEntities in HierarchicalTimelineEntity completely to avoid problems?
bq. Sounds good. I used to think about it, but not include it in this patch.

That sounds good. It would be very helpful to explicitly prohibit direct usages of isRelatedToEntities and relatesToEntities IMHO.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500398#comment-14500398 ]

Sangjin Lee commented on YARN-3431:
-----------------------------------

I know [~zjshen]'s updating the patch, but I'll provide some feedback based on the current patch and the discussion here.

Generally I agree with the approach of using fields in TimelineEntity to store/retrieve specialized information. That would definitely help with JSON's (lack of) support for polymorphism.

With regards to the parent-child relationship and relationships in general, this might be some change, but would it be better to have some kind of a key or a label for a relationship? It would help locate a particular relationship (e.g. parent) quickly, and help other use cases identify exactly the relationship they need to retrieve. Thoughts?

On a related note, I have problems with prohibiting hierarchical timeline entities from having any relationships other than parent-child. For example, frameworks (e.g. mapreduce) may use hierarchical timeline entities to describe their hierarchy (job => task => task attempts), and these entities would have dotted lines to YARN system entities (app, containers, etc.) and vice versa. It would be a pretty severe restriction to prohibit them. If we adopt the above approach, we should be able to allow both, right?

(FlowEntity.java)
- l.58: do we want to set the id once we calculate it from scratch?

(TimelineEntity.java)
- l.88: Some javadoc would be helpful in explaining this constructor. It doesn't come through as very obvious.
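One way to read the "key or label for a relationship" idea above is a map from a label to the related entity ids, so that the parent link and framework-specific links can coexist and be looked up directly. The sketch below is purely illustrative: the class LabeledRelations, its methods, and the label strings are all assumptions, not part of any patch in this thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical container for labelled relationships between entities.
class LabeledRelations {
  // Maps a relationship label (for example "PARENT") to related entity ids.
  private final Map<String, Set<String>> byLabel = new HashMap<>();

  // Records that the owning entity is related to entityId under the label.
  void add(String label, String entityId) {
    byLabel.computeIfAbsent(label, key -> new LinkedHashSet<>()).add(entityId);
  }

  // Looks up one kind of relationship without scanning the others.
  Set<String> get(String label) {
    return byLabel.getOrDefault(label, Collections.emptySet());
  }
}
```

With labels, a MapReduce task entity could carry both a "PARENT" link to its job and, say, a "RUNS_IN" link to a YARN container, which is the combination the comment argues should remain possible.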
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500672#comment-14500672 ]

Zhijie Shen commented on YARN-3431:
-----------------------------------

[~sjlee0], how about doing this? Instead of using relates_to/is_related_to to store the parent-child relationship, we put it into the info section. Then we can look up the parent-child relationship quickly, and we don't disturb the normal usage of relates_to/is_related_to. Does that sound good?
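A minimal sketch of the proposal above: the parent link lives under a dedicated key in the info map, while the is_related_to collection stays free for ordinary use. The class HierarchicalSketch, the PARENT_INFO_KEY name, and the "type:id" string encoding are assumptions for illustration; the thread does not show how the patch stores the value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical entity that keeps its parent in info, not in is_related_to.
class HierarchicalSketch {
  // Assumed key name; the thread does not give the real one.
  static final String PARENT_INFO_KEY = "PARENT_ENTITY";

  final Map<String, Object> info = new HashMap<>();
  final Map<String, Set<String>> isRelatedTo = new HashMap<>();

  // Stores the parent under a single well-known info key for direct lookup.
  void setParent(String type, String id) {
    info.put(PARENT_INFO_KEY, type + ":" + id);
  }

  // Returns the stored parent, or null if none has been set.
  String getParent() {
    return (String) info.get(PARENT_INFO_KEY);
  }

  // Ordinary relationships are untouched by the parent bookkeeping.
  void addIsRelatedTo(String type, String id) {
    isRelatedTo.computeIfAbsent(type, key -> new HashSet<>()).add(id);
  }
}
```

Finding the parent is then a single map lookup, and nothing the user adds through addIsRelatedTo can be mistaken for the parent link.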
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498169#comment-14498169 ]

Junping Du commented on YARN-3431:
--
Thanks [~zjshen] for delivering an updated patch to fix it! The solution here looks good to me in general. Some comments:

In HierarchicalTimelineEntity.java,
{code}
+  public ClusterEntity(TimelineEntity entity) {
+    super(entity);
+    if (!entity.getType().equals(TimelineEntityType.YARN_CLUSTER.toString())) {
+      throw new IllegalArgumentException("Incompatible entity type: " + getId());
+    }
+  }
{code}
This looks like a serious bug: for a subclass of HierarchicalTimelineEntity, construction runs the type check in both the subclass and the parent class. An exception is always thrown here. We should find some way to work around it, e.g. adding a boolean flag for the type check?

{code}
+  public TimelineEntity(TimelineEntity entity) {
+    prototype = entity.getPrototype();
+  }
...
+  protected TimelineEntity getPrototype() {
+    return prototype == null ? this : prototype;
+  }
{code}
I think the prototype TimelineEntity here is used to create a TimelineEntity object from an object of a TimelineEntity subclass. Is that right? If so, I don't understand what benefit we gain compared with type casting directly. Am I missing something here?
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498321#comment-14498321 ]

Zhijie Shen commented on YARN-3431:
---
bq. An exception is always thrown here. We should find some way to work around it, e.g. adding a boolean flag for the type check?
It won't always throw an exception. In this example, the exception is thrown only when you use a TimelineEntity object whose type is not YARN_CLUSTER to construct a ClusterEntity, because that construction is logically invalid.

bq. If so, I don't understand what benefit we gain compared with type casting directly. Am I missing something here?
On the web service side, we only receive the generic TimelineEntity object, not an object of its subclass. We can't cast it; instead, we have to use the generic object to construct the subclass object again.
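The two points in this reply can be illustrated with a small sketch. It uses simplified stand-in classes (`GenericEntity`, `ClusterLikeEntity`), which are assumptions for the example and not the Hadoop classes or the patch code. It shows that a downcast of a generic object fails at runtime, while construction from the generic object succeeds when the declared type matches.

```java
// Simplified stand-ins for the classes under discussion; not the Hadoop code.
class GenericEntity {
  private final String type;
  private final String id;

  GenericEntity(String type, String id) {
    this.type = type;
    this.id = id;
  }

  // Copy constructor: builds a new entity from a generic one.
  GenericEntity(GenericEntity other) {
    this(other.type, other.id);
  }

  String getType() { return type; }
  String getId() { return id; }
}

class ClusterLikeEntity extends GenericEntity {
  static final String TYPE = "YARN_CLUSTER";

  ClusterLikeEntity(GenericEntity entity) {
    super(entity);
    // Reject only entities whose declared type does not match.
    if (!TYPE.equals(entity.getType())) {
      throw new IllegalArgumentException(
          "Incompatible entity type: " + entity.getType());
    }
  }
}

class RebuildDemo {
  public static void main(String[] args) {
    // What a web service would hold after deserialization: the generic class.
    GenericEntity received = new GenericEntity("YARN_CLUSTER", "cluster_1");

    // A cast fails because the runtime class is GenericEntity.
    try {
      ClusterLikeEntity bad = (ClusterLikeEntity) received;
      System.out.println("unexpected: " + bad.getId());
    } catch (ClassCastException e) {
      System.out.println("cast failed");
    }

    // Construction from the generic object works for a matching type.
    ClusterLikeEntity cluster = new ClusterLikeEntity(received);
    System.out.println(cluster.getId());
  }
}
```

In this sketch the type check lives only in the leaf class, so it fires only for a mismatched type string.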
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499002#comment-14499002 ]

Li Lu commented on YARN-3431:
-
Hi [~zjshen], thanks for the work! I reviewed your v3 patch. A general comment about the idea, if I understand correctly: we now require subclasses to decide on a strategy for encapsulating their extended information in the fields of TimelineEntity, and we use TimelineEntity as the standard object type for web services and storage. This introduces conversion problems for subclasses: we need to provide some logic to rebuild subclasses based on their entity types. In this patch the rebuild process is implemented as TimelineCollectorWebService.processTimelineEntities. I have some questions:

# My main problem is with the {{prototype}} field of TimelineEntity. First, the name is a little awkward: it suggests that a class has a prototype of exactly the same type, which seems odd. Second, since all extended information will be stored in TimelineEntity, the only difference between subclass instances will be their type fields. If so, do we still need a separate prototype section for web services? Third, I searched the whole patch, and it seems the only place that writes to the prototype field is the constructor of TimelineEntity, where it simply stores the incoming entity's prototype. Overall, I'm a little confused about this field.
# For {{HierarchicalTimelineEntity}}, it seems we're not adding any special tags when we call {{addIsRelatedToEntity()}} in {{setParent()}}. We also require the keySet of isRelatedToEntities to have only one key. Are we prohibiting users from using {{isRelatedToEntities}} in {{HierarchicalTimelineEntity}} completely, to avoid problems?
# {{processTimelineEntities}} is now called in {{putEntities}} of {{TimelineCollectorWebService}}. From a storage layer perspective, I'm not sure we really need the subclass information. We definitely need the logic of {{processTimelineEntities}} on the reader side, and maybe in our timeline collector implementation.
# There are two .* imports in this patch, one in TestTimelineServiceClientIntegration and the other in TimelineCollectorWebService. Maybe we'd like to list them explicitly?
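The rebuild step described in this review, turning generic entities back into subclass objects based on their type field, might look roughly like the sketch below. It is not the `processTimelineEntities` implementation from the patch; `BaseEntity`, `ClusterSub`, and `EntityRebuilder` are hypothetical names, and only the `YARN_CLUSTER` type string is taken from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic entity; not the Hadoop TimelineEntity.
class BaseEntity {
  private final String type;
  private final String id;

  BaseEntity(String type, String id) {
    this.type = type;
    this.id = id;
  }

  // Copy constructor used when rebuilding a subclass object.
  BaseEntity(BaseEntity other) {
    this(other.type, other.id);
  }

  String getType() { return type; }
  String getId() { return id; }
}

// Hypothetical subclass rebuilt from a generic entity.
class ClusterSub extends BaseEntity {
  ClusterSub(BaseEntity entity) {
    super(entity);
  }
}

class EntityRebuilder {
  // Rebuilds subclass objects from generic ones, keyed on the type string.
  static List<BaseEntity> rebuild(List<BaseEntity> incoming) {
    List<BaseEntity> result = new ArrayList<>();
    for (BaseEntity entity : incoming) {
      // Guard against a missing type so the switch cannot throw.
      String type = entity.getType() == null ? "" : entity.getType();
      switch (type) {
        case "YARN_CLUSTER":
          result.add(new ClusterSub(entity));
          break;
        default:
          // Unknown or user-defined types stay generic.
          result.add(entity);
          break;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<BaseEntity> incoming = new ArrayList<>();
    incoming.add(new BaseEntity("YARN_CLUSTER", "cluster_1"));
    incoming.add(new BaseEntity("CUSTOM_TYPE", "custom_1"));
    for (BaseEntity entity : rebuild(incoming)) {
      System.out.println(entity.getClass().getSimpleName() + " " + entity.getId());
    }
  }
}
```

Because the dispatch depends only on the type string, the same helper could be called from a writer, a reader, or a collector, which is the placement question raised in the third item above.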
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488001#comment-14488001 ]

Li Lu commented on YARN-3431:
-
Hi [~zjshen], I checked your proposal and in general it LGTM. I have some minor concerns, however:

# In general we're using the v1 object model for data transfer and storage. Rebuilding the special info for subclasses may be challenging, because the special keys may be mixed with user-defined keys. Even though the chance is low, we may want to find a more elegant solution for this.
# How do subclass instances identify their own types? I think this is the core challenge here. Are we using duck typing?

That said, maybe we want a new data transfer type that can accommodate the extra data of subclasses in extension fields and can self-identify its type? I'm just thinking out loud here...
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486863#comment-14486863 ]

Zhijie Shen commented on YARN-3431:
---
I uploaded a patch that resolves the problem in a different way. I thought about the subclasses again and found that it is not necessary to have corresponding web service resources for them. In fact, there are two levels:

1. Java API level: we want the subclasses of TimelineEntity to be first-class citizens, which makes it easier for users to operate on the predefined entities. They may have special setters/getters.

2. REST API level: a JSON schema isn't polymorphic, so we should have one schema that is generic enough to describe different kinds of entities. Fortunately, the entity schema is able to do that.

The subclasses of TimelineEntity contain the following additional information:

a) Special attributes: they can be put into the info map of the entity and treated as predefined info. For example, the queue of an application entity can be put into info with key = QUEUE_INFO_KEY and value = some queue name.

b) Parent-child relationship: it can be put into the relates_to/is_related_to relationship map of the entity. The relates_to/is_related_to relationship can describe an arbitrary directed graph, and a tree is one type of directed graph.

In the new patch, I fixed the API records instead of the endpoint. Therefore, we still have a single endpoint to accept entities, and the Java APIs remain unchanged too. For the JSON content used in communication, we always use the generic entity schema for TimelineEntity and all of its subclasses.

BTW, I fixed some minor issues in this patch as well, such as renaming UserEntity and QueueEntity, and the FlowEntity attributes.
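The mapping described in (a) and (b) can be sketched as below: subclass-specific setters and getters write into the generic fields, so the object on the wire always has the generic shape. The classes `WireEntity` and `ApplicationLikeEntity`, the type string `EXAMPLE_APPLICATION`, and the value assigned to `QUEUE_INFO_KEY` are assumptions made for illustration; only the key name `QUEUE_INFO_KEY` and the use of the info and is_related_to maps come from the comment.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical generic entity holding everything that goes on the wire.
class WireEntity {
  private final String type;
  private final String id;
  private final Map<String, Object> info = new HashMap<>();
  private final Map<String, Set<String>> isRelatedTo = new HashMap<>();

  WireEntity(String type, String id) {
    this.type = type;
    this.id = id;
  }

  String getType() { return type; }
  String getId() { return id; }
  Map<String, Object> getInfo() { return info; }
  Map<String, Set<String>> getIsRelatedTo() { return isRelatedTo; }

  // Records that this entity is related to the entity (otherType, otherId).
  void addIsRelatedTo(String otherType, String otherId) {
    isRelatedTo.computeIfAbsent(otherType, k -> new HashSet<>()).add(otherId);
  }
}

// Java-API-level subclass: adds typed accessors but no new serialized fields.
class ApplicationLikeEntity extends WireEntity {
  static final String TYPE = "EXAMPLE_APPLICATION"; // illustrative type string
  static final String QUEUE_INFO_KEY = "QUEUE";     // assumed key value

  ApplicationLikeEntity(String id) {
    super(TYPE, id);
  }

  // (a) Special attribute stored as predefined info.
  void setQueue(String queue) {
    getInfo().put(QUEUE_INFO_KEY, queue);
  }

  String getQueue() {
    Object queue = getInfo().get(QUEUE_INFO_KEY);
    return queue == null ? null : queue.toString();
  }

  // (b) Parent-child relationship stored in the is_related_to map.
  void setParent(String parentType, String parentId) {
    addIsRelatedTo(parentType, parentId);
  }
}

class GenericSchemaDemo {
  public static void main(String[] args) {
    ApplicationLikeEntity app = new ApplicationLikeEntity("app_1");
    app.setQueue("default");
    app.setParent("YARN_CLUSTER", "cluster_1");
    System.out.println(app.getInfo());
    System.out.println(app.getIsRelatedTo());
  }
}
```

Since the subclass declares no fields of its own in this sketch, a serializer that only knows the generic fields would lose nothing.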
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483432#comment-14483432 ]

Junping Du commented on YARN-3431:
--
Thanks [~zjshen] for the patch and [~gtCarrera9] for the review and comments.

bq. However, I'm a little bit confused about the big picture of this patch.
I put some context and background in the JIRA description. Hope it helps.

{code}
-    putObjects("entities", params, entitiesContainer);
+    for (org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity entity : entities) {
+      String path = "entities";
+      try {
+        path += "/" + TimelineEntityType.valueOf(entity.getType()).toString();
+      } catch (IllegalArgumentException e) {
+        // Do nothing, generic entity type
+      }
+      putObjects(path, params, entity);
+    }
{code}
It looks like we are breaking one put operation into pieces. This doesn't make sense from a performance perspective. Do we have to do this? BTW, shouldn't we handle the IllegalArgumentException instead of ignoring it?
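To make the two concerns in this comment concrete, the sketch below shows one possible shape for them. It is an illustration under stated assumptions and not something proposed in the thread: it resolves the path with an explicit fallback for unknown types, and groups entities by path so that the number of requests depends on the number of distinct paths, not the number of entities. `KnownType`, `TypedId`, and `PathGrouping` are hypothetical names, and `EXAMPLE_OTHER` is an invented enum constant.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical subset of entity types; only YARN_CLUSTER is named in the thread.
enum KnownType { YARN_CLUSTER, EXAMPLE_OTHER }

// Minimal (type, id) pair standing in for a timeline entity.
class TypedId {
  final String type;
  final String id;

  TypedId(String type, String id) {
    this.type = type;
    this.id = id;
  }
}

class PathGrouping {
  static final String GENERIC_PATH = "entities";

  // Maps a type string to a REST path, with an explicit generic fallback.
  static String resolvePath(String type) {
    if (type == null) {
      return GENERIC_PATH;
    }
    try {
      return GENERIC_PATH + "/" + KnownType.valueOf(type);
    } catch (IllegalArgumentException e) {
      // Not a predefined type: fall back to the generic endpoint on purpose.
      return GENERIC_PATH;
    }
  }

  // Groups entities by target path, preserving first-seen path order.
  static Map<String, List<TypedId>> groupByPath(List<TypedId> entities) {
    Map<String, List<TypedId>> byPath = new LinkedHashMap<>();
    for (TypedId entity : entities) {
      byPath.computeIfAbsent(resolvePath(entity.type), k -> new ArrayList<>())
          .add(entity);
    }
    return byPath;
  }

  public static void main(String[] args) {
    List<TypedId> batch = new ArrayList<>();
    batch.add(new TypedId("YARN_CLUSTER", "c1"));
    batch.add(new TypedId("YARN_CLUSTER", "c2"));
    batch.add(new TypedId("CUSTOM_TYPE", "x1"));
    // One request per distinct path instead of one per entity.
    for (Map.Entry<String, List<TypedId>> entry : groupByPath(batch).entrySet()) {
      System.out.println(entry.getKey() + " -> " + entry.getValue().size());
    }
  }
}
```

Whether the server side could accept a batch per typed endpoint is not established by the thread, so the grouping is only a way to reason about the request count.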
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482266#comment-14482266 ]

Li Lu commented on YARN-3431:
-
Hi [~zjshen], thanks for working on this! I reviewed your v2 patch, and the code LGTM. However, I'm a little bit confused about the big picture of this patch. In this patch you're setting up separate REST endpoints to post different types of timeline entities. However, all of these REST endpoints have exactly the same internal logic: redirecting the incoming entity to the collector's putEntity. Are those endpoints just placeholders so that we can specialize each of them? Or else, I'm not sure about the motivation behind this (currently no description for this JIRA...). Could you please elaborate a little bit more on this?

BTW, I agree we need to specialize for different types of timeline entities, but maybe we need to do this on the collector/storage side? For the storage layer design we need to write down the detailed timeline entities, so specialization would be helpful.
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482328#comment-14482328 ]

Zhijie Shen commented on YARN-3431:
---
bq. Are those endpoints just placeholders so that we can specialize each of them? Or else, I'm not sure about the motivation behind this (currently no description for this JIRA...). Could you please elaborate a little bit more on this?
The problem is that we have TimelineEntity and all of its subclasses. On the other side, we have a single endpoint, which consumes TimelineEntity. Therefore, this endpoint checks that the incoming request body contains exactly a TimelineEntity object. The JSON data serialized from a subclass object does not seem to be treated as a TimelineEntity object, and won't be deserialized into the corresponding subclass object. I tried to figure out whether JAX-RS has a general approach for this, but didn't find an answer (please let me know if anyone has an idea). As an alternative, I chose to treat the predefined subclasses as sub-resources and put them on separate endpoints. Once deserialized on the server side, Java can identify the classes of the TimelineEntity objects and treat them accordingly, so we don't need separate Java APIs in the collector.