[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366950#comment-14366950 ]

Junping Du commented on YARN-3039:
----------------------------------

Thanks [~zjshen] and [~sjlee0] for review!

> [Aggregator wireup] Implement ATS app-appgregator service discovery
> -------------------------------------------------------------------
>
>                 Key: YARN-3039
>                 URL: https://issues.apache.org/jira/browse/YARN-3039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>             Fix For: YARN-2928
>         Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch
>
> Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366558#comment-14366558 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

I took a look at patch v.8. LGTM. Thanks much for your work [~djp]!
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366332#comment-14366332 ]

Zhijie Shen commented on YARN-3039:
-----------------------------------

[~djp], thanks for updating the patch. It looks good to me overall. Some more minor comments about the patch.

1. No need to change the constructor and have the newTimelineService flag. {{createTimelineClient(ApplicationId appId)}} is the new one added for TS v2. appId shouldn't be null for TS v2; otherwise the request won't reach the right app-level aggregator. We can use this variable to determine whether we are working with the new or the old timeline service.
{code}
public static TimelineClient newInstance() {
  return new TimelineClientImpl();
}

@Public
public static TimelineClient newInstance(ApplicationId appId) {
  return new TimelineClientImpl(appId);
}

@Public
public static TimelineClient newInstance(ApplicationId appId,
    boolean newTimelineService) {
  return new TimelineClientImpl(appId, newTimelineService);
}
{code}

2. The new put method won't reach this method. It's only called by the old put entity/domain operations, so no change is required here:
{code}
@Private
@VisibleForTesting
public ClientResponse doPostingObject(Object object, String path) {
{code}
And change
{code}
// TODO need to cleanup filter retry later.
//client.addFilter(retryFilter);
{code}
to
{code}
if (contextAppId == null) {
  client.addFilter(retryFilter);
}
{code}
such that the old put operations still use the old retry logic, while the new one uses the new retry logic.

3. Is the change in TestContainerLaunchRPC related to this jira? And for TestRPC: rename it to TestNMRPC, or keep TestRPC in common and create a TestNMRPC in server-common to test the server-side protocol?

4. You can remove the @Public/@Private annotations around the server API stuff like ResourceTracker, or mark them Private.

5. {{void removeKnownAggregator(ApplicationId appId)}} does not seem to be necessary in Context; we can call Context.getKnownAggregators().remove(appId) to remove the entry. It seems that the other collections in the context work the same way.
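The conditional filter change suggested in point 2 can be sketched as follows. This is a simplified stand-in, not the real code: the actual patch deals with a Jersey `Client` and the patch's `retryFilter`, while `Client`, `configure`, and the string filter here are hypothetical, only illustrating the null-check routing between the old and new retry paths.

```java
import java.util.ArrayList;
import java.util.List;

public class RetryFilterSketch {
    // Hypothetical stand-in for the Jersey client being configured; only
    // the addFilter call matters for this sketch.
    public static class Client {
        public final List<String> filters = new ArrayList<>();
        public void addFilter(String f) { filters.add(f); }
    }

    // Old (v1) put operations keep the legacy Jersey retry filter; the
    // v2 app-level path, identified by a non-null application id
    // (contextAppId in the patch), relies on the new retry logic instead.
    public static Client configure(String contextAppId) {
        Client client = new Client();
        if (contextAppId == null) {
            client.addFilter("retryFilter"); // legacy retry path only
        }
        return client;
    }

    public static void main(String[] args) {
        System.out.println(configure(null).filters);                 // [retryFilter]
        System.out.println(configure("application_1_0001").filters); // []
    }
}
```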
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365437#comment-14365437 ]

Junping Du commented on YARN-3039:
----------------------------------

Thanks [~sjlee0], [~zjshen] and [~jianhe] for review and comments!

bq. TimelineClient.java, l.59-64: I would vote for simply removing the method

Yes. I forgot to delete it after commenting and testing.

bq. l.413: nit: space

Fixed.

bq. l.416: should we throw a RuntimeException? how about wrapping it in a YarnException?

Sounds good. Updated.

bq. yarn-default.xml: it still has 20 as the thread count default.

Sorry, I fixed the other place but forgot to update the value here. Updated in the v6 patch.

bq. I'm not 100% certain that there is an issue, but I think we need to be careful about always removing registered/known aggregators when the app is done. Failure to do so would quickly ramp up the NM memory, and it would become unusable. Again, I think the current code will properly remove a finished app from registered/known aggregator maps. It's just a reminder to be careful about this.

Agree, and thanks for the reminder. I think it should be fine for the usual case (app finishes smoothly), and we need to track this issue in the AM/aggregator failover case as well. Will leave a comment in YARN-3038.

bq. Aggregator-NM is the server-side protocol. The related classes and .proto should be put in yarn-server-common, like ResourceTracker.

Right, thanks for pointing this out. I was thinking the aggregator belongs to application/user land. According to the current layout it belongs to server land, so I will get it updated in the v6 patch (a huge effort though).

bq. I'm not sure if we need to change ApplicationMaster. Even currently, the client will retry on connection problem with the server.

It is still necessary for now. Without a separate thread, the main thread of the AM would block there and the AM-RM heartbeat could not start, causing the deadlock mentioned by [~Naganarasimha] earlier. A cleaner solution would give TimelineClient a non-blocking call (and an event loop), which is already on my TODO list (and in the patch comments). Let's keep things as they are for now so that end-to-end can work.

bq. IMHO, for the client, we can set TimelineClient as the listener of AMRMClient by adding AMRMClient#registerTimelineAggregatorAddressListener(TimelineClient client). Therefore, the user just needs to make one additional call registerTimelineAggregatorAddressListener to combine AMRMClient and TimelineClient. Inside TimelineClientImpl, there's no need to wait in loop for the address update, and to override onAggregatorAddressUpdated. AMRMClientImpl call it when get the update from heartbeat response.

I think the loop waiting for the address update (or service availability) is still necessary, because even after the address gets updated it could still be a broken address, and retry logic would be needed in that case. Another useful case the retry logic covers is when the service address is not updated but the service recovers for some reason. Maybe the current way of registering a callback with AMRMClient (or something else for the NodeManager) to listen for address updates is enough for now, and we can improve it a bit later (together with the non-blocking call in TimelineClient) by adding a wake-up notification if a heartbeat arrives during the sleep interval?

bq. In TimelineClientImpl, it seems to be not necessary to add additional retry logic, as the client has the retry logic as ClientFilter yet.

I am not sure that the Jersey client supports retry against a different URI. From my quick glance, it sounds like the client binds to a fixed URI. Please let me know if my understanding is incorrect.

bq. BTW, the last patch should no longer apply in TimelineClientImpl. It needs to be rebased.

Thanks for the notice. It sounds like we have a security change in trunk; this is rebased in the v6 version.

bq. Here, we should not save the current app state in state-store, otherwise recovery logic will break - it's currently not expecting non-final state on recovery.

Good comment. How about we persist only the aggregator info and leave everything else as it is (like the Init status)?

bq. I think RMAppAggregatorUpdateTransition should be invoked only on certain application states that make sense to handle. better not make every app state to handle this transition. that'll cause unnecessary writes.

It only writes once (in most cases) over the app's whole life cycle. Does that sound like too much for the RMStateStore?

bq. Given it may need more treatment for state preserving, shall we spin off the related code change and defer it to a later jira? I think it shouldn't block our next milestone.

Sounds like a good plan. I will try to separate it out from the v6 patch. Will deliver the v6 patch soon to reflect the above comments.
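The wait-and-retry loop argued for above can be sketched minimally as follows; this assumes a `volatile` field written by the AM-side callback, and the class and method names (`AddressWaitSketch`, `waitForAddress`) are illustrative, not the patch's actual API.

```java
public class AddressWaitSketch {
    private volatile String aggregatorAddr; // written by the AM's heartbeat callback

    public void onAggregatorAddressUpdated(String addr) {
        aggregatorAddr = addr; // a volatile write keeps this visible across threads
    }

    // Poll until the address is published, then return it; the caller would
    // still retry the actual HTTP put if the address turns out to be broken.
    public String waitForAddress(long intervalMs, int maxRetries)
            throws InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            String addr = aggregatorAddr;
            if (addr != null) {
                return addr;
            }
            Thread.sleep(intervalMs);
        }
        throw new IllegalStateException("aggregator address not published in time");
    }
}
```

Replacing `Thread.sleep` with a `wait`/`notify` pair is exactly the "wake up on heartbeat" improvement mentioned above.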
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364010#comment-14364010 ]

Zhijie Shen commented on YARN-3039:
-----------------------------------

bq. that work can be covered in making the per-node aggregator/collector a standalone daemon.

I'm okay with the plan. Some comments about the patch:

1. Aggregator-NM is the server-side protocol. The related classes and .proto should be put in yarn-server-common, like ResourceTracker.

2. I'm not sure we need to change ApplicationMaster. Even currently, the client will retry on a connection problem with the server.

3. IMHO, for the client, we can make TimelineClient a listener of AMRMClient by adding AMRMClient#registerTimelineAggregatorAddressListener(TimelineClient client). Then the user just needs to make one additional call, registerTimelineAggregatorAddressListener, to combine AMRMClient and TimelineClient. Inside TimelineClientImpl, there's no need to wait in a loop for the address update; just override onAggregatorAddressUpdated, which AMRMClientImpl calls when it gets the update from the heartbeat response.

4. In TimelineClientImpl, it seems unnecessary to add additional retry logic, as the client already has retry logic via a ClientFilter.

BTW, the last patch no longer applies in TimelineClientImpl. It needs to be rebased.
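The listener wiring proposed in point 3 could look roughly like this. Only the method name registerTimelineAggregatorAddressListener comes from the comment above; the surrounding classes are simplified stand-ins for AMRMClientImpl and TimelineClientImpl, not the real YARN types.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerSketch {
    public interface AggregatorAddressListener {
        void onAggregatorAddressUpdated(String addr);
    }

    // Stand-in for AMRMClientImpl: it owns the heartbeat and pushes the
    // address to whoever registered.
    public static class AmRmClient {
        private final List<AggregatorAddressListener> listeners = new ArrayList<>();

        public void registerTimelineAggregatorAddressListener(
                AggregatorAddressListener l) {
            listeners.add(l);
        }

        // Invoked when an allocate (heartbeat) response carries a new address.
        public void onHeartbeatResponse(String newAddr) {
            for (AggregatorAddressListener l : listeners) {
                l.onAggregatorAddressUpdated(newAddr);
            }
        }
    }

    // Stand-in for TimelineClientImpl acting as the listener.
    public static class TimelineClient implements AggregatorAddressListener {
        public volatile String addr;
        @Override
        public void onAggregatorAddressUpdated(String a) { addr = a; }
    }
}
```

The design choice under debate is whether this push model fully replaces the client-side wait loop, or merely feeds it; the follow-up comments argue the loop is still needed for broken or stale addresses.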
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364238#comment-14364238 ]

Zhijie Shen commented on YARN-3039:
-----------------------------------

Thanks for your input, [~jianhe]! Given it may need more treatment for state preserving, shall we spin off the related code change and defer it to a later jira? I think it shouldn't block our next milestone.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364233#comment-14364233 ]

Jian He commented on YARN-3039:
-------------------------------

I think RMAppAggregatorUpdateTransition should be invoked only on certain application states where handling it makes sense. Better not to have every app state handle this transition; that'll cause unnecessary writes.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363969#comment-14363969 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

[~djp], I went over patch v.5 one more time, and it looks mostly good. I did have a few comments (as follows):

- TimelineClient.java
-- l.59-64: I would vote for simply removing the method
-- l.413: nit: space
-- l.416: should we throw a RuntimeException? how about wrapping it in a YarnException?
- yarn-default.xml
-- it still has 20 as the thread count default
- I'm not 100% certain that there is an issue, but I think we need to be careful about always removing registered/known aggregators when the app is done. Failure to do so would quickly ramp up the NM memory, and it would become unusable. Again, I _think_ the current code will properly remove a finished app from the registered/known aggregator maps. It's just a reminder to be careful about this.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364198#comment-14364198 ]

Jian He commented on YARN-3039:
-------------------------------

Here, we should not save the current app state in the state store; otherwise the recovery logic will break - it currently does not expect a non-final state on recovery.
{code}
// Save to RMStateStore
ApplicationStateData appState = ApplicationStateData.newInstance(
    app.getSubmitTime(), app.getStartTime(), app.getUser(),
    app.getApplicationSubmissionContext(), app.getState(),
    app.getDiagnostics().toString(), app.getFinishTime(),
    app.getAggregatorAddr());
app.rmContext.getStateStore().updateApplicationState(appState);
{code}
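One way to picture the constraint above, with purely illustrative names (the thread's actual resolution was to persist only the aggregator info, or defer the state-store change entirely): never overwrite the stored app state with a non-final one, keeping whatever was stored before unless the app has genuinely reached a final state.

```java
public class StateStoreGuardSketch {
    // Illustrative subset of app states; the real RMAppState enum is larger.
    public enum RMAppState { NEW, RUNNING, FINISHED, FAILED, KILLED }

    public static boolean isFinal(RMAppState s) {
        return s == RMAppState.FINISHED || s == RMAppState.FAILED
            || s == RMAppState.KILLED;
    }

    // Persist the aggregator address without breaking recovery: keep the
    // previously stored state unless the current state is final.
    public static RMAppState stateToStore(RMAppState current,
            RMAppState previouslyStored) {
        return isFinal(current) ? current : previouslyStored;
    }
}
```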
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363676#comment-14363676 ]

Zhijie Shen commented on YARN-3039:
-----------------------------------

Thanks for the feedback, Junping and Sangjin! It sounds reasonable to have the aggregator as the client and the NM as the server. However, my major point is not about which side is the server/client, but about the rpc call to trigger app-level aggregator initialization. In YARN-3210's refactoring, one goal is to keep the aux service as clean as possible, so that we can reuse most of the code in the future, as we know we will move on to a non-aux-service deployment. So in this work, since we have introduced the rpc call, I raise the thought of using the rpc call to start the app-level aggregator instead of the aux service lifecycle event handler. IMHO, it will be necessary in the future when the aggregator no longer resides in the same process as the NM, and I can see the merit of unifying the way we start the app-level aggregator.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363706#comment-14363706 ]

Junping Du commented on YARN-3039:
----------------------------------

Thanks [~zjshen] for the clarification!

bq. It sounds reasonable to have aggregator as the client and NM as the server. However, my major point is not about which side is the server/client, but about the rpc call to trigger app-level aggregator initialization.

+1. That's a good point we all agree on, I think.

bq. So in this work, since we have involved the rpc call, I raise the thought of using the rpc call to start app-level aggregator instead of aux service lifecycle event handler. IMHO, it's necessary in the future when the aggregator no longer resides in the same process of NM, such that I can see the merit of uniforming the way to start app-level aggregator.

I like the idea. However, the major work of this JIRA's patch is on setting up a client/server protocol between the aggregator and the NM for service address registration, and broadcasting the address to the AM and other NMs via the RM. The work for starting the aggregator is covered by YARN-3030 (done for the auxiliary service) and YARN-3033 (for the standalone model). I would expect YARN-3033 to address the aggregator startup details, given this patch is big enough. Thoughts?
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363811#comment-14363811 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

[~zjshen],

{quote}
So in this work, since we have involved the rpc call, I raise the thought of using the rpc call to start app-level aggregator instead of aux service lifecycle event handler. IMHO, it's necessary in the future when the aggregator no longer resides in the same process of NM, such that I can see the merit of uniforming the way to start app-level aggregator.
{quote}

I completely agree that we need an RPC-based mechanism to create and shut down the app-level aggregator/collector. As [~djp] mentioned, that work can be covered in making the per-node aggregator/collector a standalone daemon.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361074#comment-14361074 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

I stand corrected, [~djp]. For some strange reason I missed the null check in the while loop, which is why I mistakenly thought that every call would end up right in the Thread.sleep(). Thanks for the correction.
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361267#comment-14361267 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

[~djp], a couple of quick comments (I'll follow up after reviewing the latest patch more carefully).

{quote}
We could have an annotation at the class level which is the default publicity and stability for each method. However, each method can have its own annotation to override the class one. In most cases, the class-level annotation is more public and stable than individual methods, as it is a first-class contract with end users or other components (or they will have concerns about using it). As an example, if we need to add a new API that is not yet stable to a protocol class marked stable, we shouldn't regress the whole class from stable to evolving or unstable; instead we can mark the new method as unstable or evolving. Make sense?
{quote}

Yes, I get the reasoning for annotating individual methods. My concern is more about the *new classes*. Note that we're still evolving even the class names. This might be a fine point, but I feel we should annotate the *new classes* at least as unstable for now, in addition to the method annotations. Thoughts?

{quote}
bq. RMAppImpl.java, Would this be backward compatible from the RM state store perspective?

I don't think so. ApplicationDataProto is also a protobuf object, and the new field for aggregatorAddress is optional.
{quote}

So you mean it will be backward compatible, right?
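The convention being discussed (Hadoop's InterfaceAudience/InterfaceStability annotations) can be illustrated with local stand-in annotations, since the point is purely about class-level defaults being overridden per method; SomeProtocol and its methods are hypothetical.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationSketch {
    // Stand-ins mirroring InterfaceStability.Stable / Unstable.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Stable {}
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Unstable {}

    // The class-level annotation is the default contract for every method...
    @Stable
    public abstract static class SomeProtocol {
        public abstract void existingCall();        // inherits class-level Stable

        // ...while a new, not-yet-stable API overrides it per method,
        // instead of regressing the whole class to Unstable.
        @Unstable
        public abstract void newExperimentalCall();
    }
}
```

Sangjin's point above is the complementary one: brand-new classes whose very names may still change should carry an Unstable class-level annotation regardless of the per-method markers.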
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361280#comment-14361280 ]

Sangjin Lee commented on YARN-3039:
-----------------------------------

[~zjshen], [~djp], regarding the idea of having IPC from the NM to the per-app collector, I don't think that will work with the special container use case. The special container for the per-app collector will bind to a port for RPC that is not determined until the collector binds to it. So it's basically a chicken-and-egg problem: the NM doesn't know the RPC port for the per-app collector in the special container until... the special container tells it. This is not a problem with the current per-node collector container situation. Although it's a little roundabout, I don't see a fundamental problem with the per-app collector (or the collection of them) sending its location to the NM once it's up. It's actually conceptually simpler, and it should work in all 3 modes (aux service, standalone per-node daemon, and special container).
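The chicken-and-egg point can be seen with plain sockets, independent of YARN: an ephemeral port is assigned by the OS only at bind time, so the binding process is the only one able to announce the port. The class and method names below are illustrative.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class EphemeralPortSketch {
    // Bind to port 0 so the OS picks a free port. The actual port is only
    // knowable after bind() returns, which is why the NM cannot be told the
    // collector's address in advance; the collector has to report it back.
    public static int bindAndGetPort(ServerSocket server) throws IOException {
        server.bind(new InetSocketAddress(0));
        return server.getLocalPort();
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket()) {
            int port = bindAndGetPort(server);
            // The collector would now register "host:" + port with the NM.
            System.out.println("bound ephemeral port " + port);
        }
    }
}
```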
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361353#comment-14361353 ]

Junping Du commented on YARN-3039:
----------------------------------

bq. Yes, I get the reasoning for annotating individual methods. My concern is more about the new classes. Note that we're still evolving even the class names. This might be a fine point, but I feel we should annotate the new classes at least as unstable for now in addition to the method annotations. Thoughts?

Agree. In the v5 patch, I tried to mark all interfaces (including some abstract classes; we don't need to mark implementations because they follow the same annotations as the parent class/interface) with either Evolving or Unstable. Please let me know if I missed something there.

bq. So you mean it will be backward compatible, right?

Yes, that is what I mean.

bq. NM doesn't know the RPC port for the per-app collector in the special container until ... the special container tells it. This is not a problem with the current per-node collector container situation.

Makes sense. That's also a good reason to keep the NM as the RPC server and the aggregator (collector) collection as the client.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357111#comment-14357111 ] Junping Du commented on YARN-3039: -- Thanks [~sjlee0]! Here is an end-to-end flow which could be helpful for your review:
- When the AM gets launched, the NM auxiliary service adds a new aggregator service to the per-node aggregatorCollection for the necessary binding work. The aggregatorCollection also has a client for AggregatorNodeManagerProtocol to notify the NM of each newly registered app aggregator and its detailed address.
- When the NM gets notified, it updates its registeredAggregators list (covering all local app aggregators) and notifies the RM in the next heartbeat.
- When the RM receives registeredAggregators from an NM, it updates its own aggregators list.
- The next time other NMs and AMs heartbeat with the RM, it provides the aggregatorInfo in the heartbeat response (for the AM, through the AllocateResponse).
- The AM of Distributed Shell has an AMRMClientAsync that heartbeats with the RM, so it can receive updated aggregator addresses periodically. By registering a callback listening for aggregator address updates, it can update the address of its TimelineClient in a thread-safe way.
- The AM calls timeline operations in a non-blocking way (so it doesn't hang in a deadlock); currently this is wrapped in a new thread, but it will be improved later (in another JIRA) to save thread resources.
- The TimelineClient (consuming the v2 service) loops in retry logic until it gets the correct address set by the AM.
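The AM-side step above (a heartbeat callback updating the TimelineClient address in a thread-safe way) can be sketched minimally as follows. All class and method names here are illustrative stand-ins, not the actual patch classes; the point is that a single volatile field gives the needed visibility between the one callback thread that writes and the posting threads that read.

```java
// Hypothetical sketch of the AM-side address update: the AMRMClientAsync
// callback would call setTimelineServiceAddress() when the heartbeat
// response carries a new aggregator address.
public class AggregatorAddressHolder {
  // volatile is enough here: a single callback thread writes, posting
  // threads only read, and we need visibility rather than compound updates.
  private volatile String timelineServiceAddress;

  // Invoked from the heartbeat callback on an address update.
  public void setTimelineServiceAddress(String address) {
    this.timelineServiceAddress = address;
  }

  // Read by posting threads; may be null before the first update arrives.
  public String getTimelineServiceAddress() {
    return timelineServiceAddress;
  }
}
```

A setter/getter pair over a volatile field is all the synchronization this pattern needs, which matches the later review comment that synchronized accessors are unnecessary.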
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359957#comment-14359957 ] Junping Du commented on YARN-3039: -- Also, thanks [~zjshen] for the comments!
bq. Sorry for raising the question so late, but I'd like to think out loud about the first step.
No worries. A good idea never comes late.
bq. Nowadays, the app-level aggregator is started by the callback handler listening to the container start event of the NM. Given we are going to support stand-alone and container modes, this approach may not work. As we're going to have an IPC channel between the aggregator and the NM, should we use an IPC call to invoke adding an app-level aggregator? That would make the NM the client and the aggregatorCollection the server.
I think in our design for multiple aggregator models (shown in YARN-3033), the relationship between the NM and aggregatorCollections could in the future be a 1:n mapping. In that case, making the NM the server could be better; otherwise the NM would have to maintain different clients to talk to different aggregatorCollections. It also aligns better with the design of YARN-3332, where it sounds like the NM will have more information to share out as a server. [~vinodkv], please correct me if I miss something here. Making the NM the server can still work for the stand-alone and container modes; several ways I can think of now:
1. add a heartbeat between the NM and the aggregatorCollection: given it is IPC and the number of aggregatorCollections is 1 or a small number, the heartbeat interval can be much less than 1 second. Also, richer info (like health status, etc.) could be carried in the request and response rather than just the applicationId and service address.
2. if we don't want any delay, we can have the NM hold the RPC response for a while and return when a new AM launch request arrives or an interval is reached; the aggregator then sends the next RPC request again immediately even if no new aggregator got bound.
Thoughts?
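Option 2 above is essentially long polling: the server parks the request until there is something to report or a timeout elapses. A toy sketch of that idea, with a BlockingQueue standing in for the real NM/collector RPC plumbing (all names here are made up for illustration, not from the patch):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative long-poll sketch: the NM side enqueues new-AM events, and the
// collector side blocks on poll() until an event arrives or the interval
// expires (returning null), after which it immediately polls again.
public class LongPollSketch {
  private final BlockingQueue<String> pendingApps = new LinkedBlockingQueue<>();

  // NM side: a new AM launch enqueues work for the collector.
  public void onAmLaunched(String appId) {
    pendingApps.offer(appId);
  }

  // Collector side: blocks up to timeoutMs; null means "no new AM this
  // round", mirroring the held-then-returned RPC response in option 2.
  public String poll(long timeoutMs) {
    try {
      return pendingApps.poll(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }
}
```

The trade-off versus a short fixed-interval heartbeat (option 1) is latency against held server-side resources: long polling delivers the event immediately but ties up a handler thread per waiting client.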
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359936#comment-14359936 ] Junping Du commented on YARN-3039: -- [~sjlee0], thanks for the comments here! In my understanding, if timelineServiceAddress is not null, pollTimelineServiceAddress() fast-returns as below, doesn't it? Do you think we still need a null check here?
{code}
private int pollTimelineServiceAddress(int retries) {
  while (timelineServiceAddress == null && retries > 0) {
    try {
      Thread.sleep(this.serviceRetryInterval);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    timelineServiceAddress = getTimelineServiceAddress();
    retries--;
  }
  return retries;
}
{code}
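The fast-return claim in this comment can be checked with a runnable version of the quoted loop. The field and method names below are copied from the snippet; the surrounding class is illustration-only scaffolding, not the actual TimelineClientImpl:

```java
// Runnable sketch of the quoted poll logic. When timelineServiceAddress is
// already set, the while condition is false on the first check, so the
// method returns without sleeping and without consuming any retries.
public class PollSketch {
  volatile String timelineServiceAddress;
  long serviceRetryInterval = 10; // ms; kept short for illustration

  // Stub for the real address lookup (which reads RM-supplied state).
  String getTimelineServiceAddress() {
    return timelineServiceAddress;
  }

  int pollTimelineServiceAddress(int retries) {
    while (timelineServiceAddress == null && retries > 0) {
      try {
        Thread.sleep(serviceRetryInterval);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      timelineServiceAddress = getTimelineServiceAddress();
      retries--;
    }
    return retries;
  }
}
```

Note that the loop sleeps before re-reading the address, so the only case that pays any latency is the one where the address is still null, which supports the fast-return argument made in this comment.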
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359441#comment-14359441 ] Zhijie Shen commented on YARN-3039: ---
bq. When the AM gets launched, the NM auxiliary service will add a new aggregator service to the aggregatorCollection (per node) for the necessary binding work. The aggregatorCollection also has a client for AggregatorNodeManagerProtocol to notify the NM of a newly registered app aggregator and its detailed address.
Hi Junping, thanks for creating the new patch. Sorry for raising the question so late, but I'd like to think out loud about the first step. Nowadays, the app-level aggregator is started by the callback handler listening to the container start event of the NM. Given we are going to support stand-alone and container modes, this approach may not work. As we're going to have an IPC channel between the aggregator and the NM, should we use an IPC call to invoke adding an app-level aggregator? The protocol would be that the NM sends a request to the aggregator collection to start an app-level aggregator, and the collection responds with the aggregator address. However, in this case it may not be AggregatorNodemanagerProtocol but NodemanagerAggregatorProtocol instead. The benefit is to unify the way of starting an app-level aggregator inside the node-level aggregator (at least it seems we need to do something similar in YARN-3033), further reducing the dependency/assumption on the aux service. [~vinodkv] and [~sjlee0], what do you think about it?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359163#comment-14359163 ] Junping Du commented on YARN-3039: -- Thanks [~sjlee0] for the review and many good comments! I addressed most of them in the v5 patch and will post it soon.
bq. I see some methods are marked as Stable (e.g. AggregatorNodemanagerProtocol), but I feel that's a bit premature. Can we mark these still unstable or evolving? Note that at least at this point even the class names can change.
Nice catch! Let's mark them as evolving instead.
bq. While we're on the API annotations, I notice that the annotations are on methods rather than classes themselves. I usually set them on the classes with the understanding that the entire class is unstable, for example. Which is a more common practice?
We could have a class-level annotation that provides the default visibility and stability for each method. However, each method can have its own annotation that overrides the class-level one. In most cases, the class-level annotation is more public and stable than those of individual methods, because the class is the first-class contract with end users or other components (otherwise they would have concerns about using it). For example, if we need to add a new, not-yet-stable API to a protocol class marked as stable, we shouldn't regress the whole class from stable to evolving or unstable; instead, we can mark the new method as unstable or evolving. Makes sense?
bq. I feel that the default for NM_AGGREGATOR_SERVICE_THREAD_COUNT doesn't have to be as high as 20 as the traffic will be pretty low; 5?
5 sounds good. Will update.
bq. Since createTimelineClient(ApplicationId) was introduced only on this branch, we should be able to just replace it instead of adding a new deprecated method, no?
I didn't check the history when we added createTimelineClient(ApplicationId). If so, we can just replace it. For the other createTimelineClient() methods, we can mark them as deprecated if they showed up in previous releases, because newInstance() is a more standard way of doing factory work.
bq. putObjects() Not sure if I understand the comment "timelineServiceAddress couldn't have been initialized"; can't putObjects() be called in a steady state? If so, shouldn't we check if timelineServiceAddress is not null before proceeding to loop and wait for the value to come in? Otherwise, we would introduce a 1 second latency in every put call even in a steady state?
In the steady state there is no initialization delay, because timelineServiceAddress is already there (in the timelineClient). The cost here only happens for the first event, when the timelineClient starts to post, or after timelineServiceAddress gets updated (due to a failure or other reasons). We designed it this way to make sure the TimelineClient can handle service discovery itself rather than letting the caller figure it out.
bq. maxRetries -> retries might be a better variable name?
Updated.
bq. It might be good to create a small helper method for polling for the timelineServiceAddress value
You are right that we can always abstract out a helper method to make the logic more concise. Updated with a pollTimelineServiceAddress() method.
bq. Not sure if we need a while loop for needRetry; it either succeeds (at which point needRetry becomes false and you exit normally) or it doesn't (in which case we go into the exception handling and we try only once to get the value). Basically I'm not sure whether this retry code is what you meant to do?
That's actually a bug: the try-catch should be inside the while loop, so we can tolerate post failures for some number of retries (the address could be stale and in the middle of being updated by the RM). Thanks for identifying this.
bq. I think it may be enough to make timelineServiceAddress volatile instead of making getter/setter synchronized.
Agree. Given that only one thread can update it today, volatile should be safe enough.
bq. doPostingObject() has duplicate code with putObjects(); can we consider ways to eliminate code duplication? I know it calls different methods deep inside the implementation, but there should be a way to reduce code duplication.
Agree. In the new patch (v5) I will abstract the common logic into helper methods.
bq. typo: ILLEAGAL_NUMBER_MESSAGE -> ILLEGAL_NUMBER_MESSAGE; Context.java: we might want to update the comment (or method names) a little bit; NodeManager.java: removeRegisteredAggregators() -> removeRegisteredAggregator() (should be singular)
Updated for these comments.
bq. We need removeKnownAggregator() as well; otherwise we'd blow the NM memory. Perhaps what we need is more like setKnownAggregators() instead of add? Essentially NM would need to replace its knowledge of the known aggregators every time RM updates it via heartbeat, right?
I was thinking some shortcut for registeredAggregators goes to knownAggregators directly so some local aggregators don't
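The retry bug acknowledged above (try-catch outside instead of inside the while loop) can be sketched in its corrected form as follows. The Poster interface and every name here are hypothetical stand-ins for illustration, not the patch's actual classes:

```java
// Corrected retry shape: the try/catch sits inside the loop, so a failed
// post (e.g. against a stale aggregator address) is retried up to
// maxRetries times instead of aborting after the first failure.
public class RetrySketch {
  public interface Poster {
    void post() throws Exception;
  }

  // Returns true if a post eventually succeeded within maxRetries attempts.
  public static boolean postWithRetries(Poster poster, int maxRetries) {
    int retries = maxRetries;
    boolean needRetry = true;
    while (needRetry && retries > 0) {
      try {
        poster.post();     // may fail while the address is stale
        needRetry = false; // success: exit the loop normally
      } catch (Exception e) {
        retries--;         // failure: stay in the loop and try again
      }
    }
    return !needRetry;
  }
}
```

With the try/catch outside the loop, one exception would end the whole call; inside the loop, each failure merely consumes one retry, which is the behavior the review comment was asking for.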
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359603#comment-14359603 ] Sangjin Lee commented on YARN-3039: --- [~djp], thanks for your super prompt reply and the update! I'm going to go over your reply and the new patch soon, but wanted to clarify one point. You said
{quote}
In the steady state there is no initialization delay, because timelineServiceAddress is already there (in the timelineClient). The cost here only happens for the first event, when the timelineClient starts to post, or after timelineServiceAddress gets updated (due to a failure or other reasons). We designed it this way to make sure the TimelineClient can handle service discovery itself rather than letting the caller figure it out.
{quote}
I'm not quite sure if I understood this part. The code in question is, for example:
{code}
368   @Private
369   public void putObjects(String path, MultivaluedMap<String, String> params,
370       Object obj) throws IOException, YarnException {
371
372     // timelineServiceAddress may not have been initialized yet
373     // or may be stale (only for the new timeline service)
374     int retries = pollTimelineServiceAddress(this.maxServiceRetries);
375
{code}
The putObjects() method can be called in a steady state (i.e. long after the timeline service address is initialized), right? Then *don't we want to check if timelineServiceAddress is null* before proceeding to poll for it?
Like (lines 372 and 376):
{code}
368   @Private
369   public void putObjects(String path, MultivaluedMap<String, String> params,
370       Object obj) throws IOException, YarnException {
371
372     if (timelineServiceAddress == null) {
373       // timelineServiceAddress may not have been initialized yet
374       // or may be stale (only for the new timeline service)
375       int retries = pollTimelineServiceAddress(this.maxServiceRetries);
376     }
377
{code}
Without that null check, invocations of putObjects() would *always* call pollTimelineServiceAddress() even if timelineServiceAddress is already set, right? My understanding is that we shouldn't even poll if timelineServiceAddress is already populated. It is possible for the value to have changed, but that's covered by the later check when you handle the exception of posting the entities. Did I miss something?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357038#comment-14357038 ] Sangjin Lee commented on YARN-3039: --- Thanks [~djp]! I'll take a look at it and add my comments.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358048#comment-14358048 ] Sangjin Lee commented on YARN-3039: --- Thanks [~djp] for the patch! I took a first pass and have some comments. Sorry the comments are long, but the patch was pretty big. :) I agree with most of the design in the design doc as well as the implementation, but there are certain things that need your attention. Let me know what you think. I'll post more if I see anything else. I still need to look at aspects like thread safety a little more closely.
- I see some methods are marked as Stable (e.g. AggregatorNodemanagerProtocol), but I feel that's a bit premature. Can we mark these still unstable or evolving? Note that at least at this point even the class names can change.
- While we're on the API annotations, I notice that the annotations are on methods rather than classes themselves. I usually set them on the classes with the understanding that the entire class is unstable, for example. Which is a more common practice?
- YarnConfiguration.java
-- I feel that the default for NM_AGGREGATOR_SERVICE_THREAD_COUNT doesn't have to be as high as 20 as the traffic will be pretty low; 5?
- TimelineClient.java
-- Since createTimelineClient(ApplicationId) was introduced only on this branch, we should be able to just replace it instead of adding a new deprecated method, no?
- TimelineClientImpl.java
-- putObjects()
--- Not sure if I understand the comment "timelineServiceAddress couldn't have been initialized"; can't putObjects() be called in a steady state? If so, shouldn't we check if timelineServiceAddress is not null before proceeding to loop and wait for the value to come in? Otherwise, we would introduce a 1 second latency in every put call even in a steady state?
--- maxRetries -> retries might be a better variable name?
--- It might be good to create a small helper method for polling for the timelineServiceAddress value
--- Not sure if we need a while loop for needRetry; it either succeeds (at which point needRetry becomes false and you exit normally) or it doesn't (in which case we go into the exception handling and we try *only once* to get the value). Basically I'm not sure whether this retry code is what you meant to do?
--- I think it may be enough to make timelineServiceAddress volatile instead of making getter/setter synchronized.
-- doPostingObject() has duplicate code with putObjects(); can we consider ways to eliminate code duplication? I know it calls different methods deep inside the implementation, but there should be a way to reduce code duplication.
- TestRPC.java
-- typo: ILLEAGAL_NUMBER_MESSAGE -> ILLEGAL_NUMBER_MESSAGE
- Context.java
-- We might want to update the comment (or method names) a little bit; initially I didn't quite get the difference between the registered aggregators and known aggregators.
- NodeManager.java
-- removeRegisteredAggregators() -> removeRegisteredAggregator() (should be singular)
-- We need removeKnownAggregator() as well; otherwise we'd blow the NM memory. :) Perhaps what we need is more like setKnownAggregators() instead of add? Essentially the NM would need to replace its knowledge of the known aggregators every time the RM updates it via heartbeat, right?
- NodeStatusUpdaterImpl.java
-- I'm not sure if all the add/remove***Aggregators() methods are called at the right time. Especially, I'm not sure if we're handling the case of removing an aggregator (e.g. app finished). When an app finishes and the per-app aggregator is shut down, the NM needs to be notified, remove it from its registered aggregators, and let the RM know that it's gone so that it can be removed from the RM state store as well. How is this being handled? It's not clear to me. At least I don't see any calls to removeRegisteredAggregators().
- ResourceTrackerService.java
-- (see above) Shouldn't we handle a situation where the aggregator is removed? updateAppAggregatorsMap() seems to deal with adding/updating an aggregator only.
-- Any race condition in updateAppAggregatorsMap() when one AM (app attempt) goes down and another AM (app attempt) comes up?
- RMAppImpl.java
-- Would this be backward compatible from the RM state store perspective?
- TimelineAggregatorsCollection.java
-- What value will be set as timelineRestServerBindAddress? It appears to be just the value of TIMELINE_SERVICE_BIND_HOST. That doesn't sound right to me; for each node, we need to get the fully qualified hostname of the node. It doesn't look like that's the case here?
- Would it be a good idea to provide an RPC method to query for the updated aggregator information? Waiting for the heartbeats is OK, but if it fails, it might be easier/better to query (e.g. NM discovering the aggregator address).
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356179#comment-14356179 ] Junping Du commented on YARN-3039: -- Hi [~sjlee0], thanks for the comments above. I think we are on the same page now. Please check the v2 proposal attached.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14351132#comment-14351132 ] Sangjin Lee commented on YARN-3039: --- Sorry for chiming in late. Some of the questions may have been addressed already, but I'll add my 2 cents.
bq. The thing that could be different in the modes mentioned above is who does the registration and how. I would prefer some other JIRA, like YARN-3033, to address these differences. Thoughts?
That sounds fine.
{quote}
We need the RM to write some initial app info standalone. However, do we expect the RM to write all app-specific info or just in the beginning? We have a similar case in launching an app's containers: the first AM container gets launched by the RM, but the following containers get launched by the AM. Do we want to follow this pattern if we want to consolidate all app info with only one app aggregator? Didn't we want a singleton app aggregator for all app-related events, logs, etc.? Ideally, only this singleton aggregator has the magic to sort out app info in aggregation. If not, we can even give up the current flow NM(s) - app aggregator (deployed on one NM) - backend and let the NM talk to the backend directly, saving a hop for traffic. Can you clarify more on this?
{quote}
All the application lifecycle events (app state transitions) should be written *directly* by the RM. The main reason for that is that the app-level aggregator may not even be up when the application lifecycle starts. So it seems pretty natural for the RM to be in charge of handling application lifecycle events and writing them directly (it would be even more awkward to have the RM write the first lifecycle event directly, and all subsequent ones through the app-level aggregator). The container lifecycle events (container state transitions) should be written by the respective NMs that handled the container state transitions, through the right app-level aggregator.
So, all the app-related writes *do* go through the app-level aggregator. The only exception is the RM directly writing the application lifecycle events. As an aside, I'd like to note that as a rule all system events (YARN-generic events that pertain to lifecycles, etc.) must be written by YARN daemons (directly or through the app-level aggregator), and they should *not* be written by the AMs, as we cannot rely upon the AMs to write them. Does that clarify the point?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347053#comment-14347053 ] Naganarasimha G R commented on YARN-3039: - Hi [~djp],
bq. for security reasons it makes the NM take AMRMTokens for using TimelineClient in the future, which makes less sense. To get rid of the race condition you mentioned above, we propose to use an observer pattern so that TimelineClient can listen for aggregator address updates in the AM or NM (wrapped with retry logic to tolerate connection failures).
Even if we are not able to have AMRMClient wrapped into TimelineClient, I feel the other suggestion from Vinod was right: {{to add a blocking call in AMRMClient to get the aggregator address directly from the RM}} instead of the observer pattern on the AM side. Thoughts?
bq. There are other ways (check the diagram in YARN-3033) in which app aggregators could be deployed, in a separate process or an independent container, which makes it less sensible to have a protocol between the aux service and the RM. I think now we should plan to add a protocol between the aggregator and the NM, and then notify the RM through the NM-RM heartbeat on registering/rebinding an aggregator.
Yes, I have gone through YARN-3033, but earlier I was trying to point out that our current approach was with the NM aux service. Anyway, what I wanted was some kind of protocol between the app aggregators and either the NM or the RM. A protocol between the NM and the app aggregator should suffice for all the other ways to launch app aggregators.
bq. the app aggregator should have logic to consolidate all messages (events and metrics) for one application into the more complex and flexible new data model. If each NM does aggregation separately, then it is still a writer (like the old timeline service), not an aggregator.
Well, if there is no logic/requirement to aggregate/consolidate all messages (events and metrics) for an app, then in my opinion it is better not to have additional aggregator instances and we can keep it similar to the old Timeline service.
bq. Will update the proposal to reflect all these discussions (JIRAs and offline).
Thanks; it will be clearer to implement once we have the proposals documented.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347384#comment-14347384 ] Junping Du commented on YARN-3039: -- Thanks [~Naganarasimha] for the comments!
bq. Even if we are not able to have AMRMClient wrapped into TimelineClient, I feel the other suggestion from Vinod was right: to add a blocking call in AMRMClient to get the aggregator address directly from the RM, instead of the observer pattern on the AM side. Thoughts?
I am open to this approach. However, I'd treat it more as an optimization (so we don't have to wait for the heartbeat interval). Within this JIRA's scope, I think we should have the heartbeat in ApplicationMasterService as the basic mechanism, because some applications (like MR) don't use AMRMClient for now. We can have a separate JIRA to address this optimization if necessary. BTW, what's your concern with the observer (listener) pattern in the AM?
bq. Yes, I have gone through YARN-3033, but earlier I was trying to point out that our current approach was with the NM aux service. Anyway, what I wanted was some kind of protocol between the app aggregators and either the NM or the RM. A protocol between the NM and the app aggregator should suffice for all the other ways to launch app aggregators.
Yes. Agreed that there aren't many differences between the aggregator talking to the NM or the RM. As the demo patch shows, I would slightly prefer the NM because the RM already hosts too many RPC services today.
bq. Well, if there is no logic/requirement to aggregate/consolidate all messages (events and metrics) for an app, then in my opinion it is better not to have additional aggregator instances and we can keep it similar to the old Timeline service.
I am not sure about this, but I assume this is part of the motivation for the new TimelineService (not only performance reasons)?
bq. Thanks; it will be clearer to implement once we have the proposals documented.
No problem. I will upload a new one when figuring out the demo patch, which forces me to address more details.
[Aggregator wireup] Implement ATS app-appgregator service discovery --- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
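The observer (listener) pattern discussed above can be sketched roughly as follows. This is an illustrative model only, written against hypothetical names (`AggregatorAddressListener`, `SketchTimelineClient`, `SketchHeartbeatHandler`); none of these are actual YARN APIs, and the real wiring would live inside TimelineClient and the AM-RM heartbeat handling.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface: notified when an app's aggregator address changes.
interface AggregatorAddressListener {
    void onAggregatorAddressUpdated(String appId, String address);
}

// A timeline client that passively receives address updates instead of
// blocking on the RM; writes always go to the last known address.
class SketchTimelineClient implements AggregatorAddressListener {
    private volatile String aggregatorAddress;

    @Override
    public void onAggregatorAddressUpdated(String appId, String address) {
        this.aggregatorAddress = address; // switch future writes to the new aggregator
    }

    String currentAggregatorAddress() {
        return aggregatorAddress;
    }
}

// Stands in for the AM (or NM) code that processes heartbeat responses and
// fans the new aggregator address out to registered listeners.
class SketchHeartbeatHandler {
    private final List<AggregatorAddressListener> listeners = new CopyOnWriteArrayList<>();

    void register(AggregatorAddressListener l) {
        listeners.add(l);
    }

    void handleHeartbeatResponse(String appId, String newAddress) {
        for (AggregatorAddressListener l : listeners) {
            l.onAggregatorAddressUpdated(appId, newAddress);
        }
    }
}
```

In this shape the client never needs AMRMTokens: it only reacts to addresses that the heartbeat path already receives, which matches the security argument made above. A retry-on-connect-failure wrapper would sit inside the client, not the listener.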
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14344914#comment-14344914 ] Junping Du commented on YARN-3039: -- Thanks for the comments, [~Naganarasimha]!
bq. +1 for this approach. Also, if the NM uses this new blocking call in AMRMClient to get the aggregator address, there might not be any race condition for the NM posting the AM container's lifecycle events immediately after creation of the app aggregator through the aux service.
I discussed this with [~vinodkv] and [~zjshen] again offline. It looks heavyweight to make TimelineClient wrap AMRMClient; in particular, for security reasons it would force the NM to carry AMRMTokens in order to use TimelineClient in the future, which makes little sense. To get rid of the race condition you mention above, we propose an observer pattern so that TimelineClient can listen for aggregator address updates in the AM or NM (wrapped with retry logic to tolerate connection failures).
bq. Are we just adding a method to get the aggregator address, or are other APIs planned?
Per the comments above, we have no plan to add an API to TimelineClient that talks to the RM directly.
bq. I believe the idea of using the aux service was to decouple the NM and the Timeline service. If the NM will notify the RM about new app aggregator creation (based on the aux service), then the NM must be aware that PerNodeAggregatorServer is configured as an aux service, and if it supports rebinding the app aggregator on failure, it must be able to communicate with this aux service too. Would that be a clean approach?
I agree we want to decouple things here. However, the aux service is not the only way to deploy app aggregators. There are other ways (see the diagram in YARN-3033): app aggregators could be deployed in a separate process or in an independent container, which makes it less sensible to have a protocol between the aux service and the RM.
I think we should now plan to add a protocol between the aggregator and the NM, and then notify the RM through the NM-RM heartbeat when registering/rebinding an aggregator.
bq. I also feel we need to support starting a per-app aggregator only if the app requests it (Zhijie also mentioned this). If not, we can make use of one default aggregator for all such apps launched on the NM, which is just used to post container entities from the different NMs for these apps.
My 2 cents here: the app aggregator should have logic to consolidate all messages (events and metrics) for one application into the more complex and flexible new data model. If each NM does aggregation separately, then it is still a *writer* (like the old Timeline service), not an *aggregator*. Thoughts?
bq. Did any discussions happen w.r.t. the RM having its own aggregator? I feel it would be better for the RM to have one, as it need not depend on any NMs to post entities.
Agreed. I think we are on the same page now. I will update the proposal to reflect all these discussions (on JIRA and offline).
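The registration flow proposed here (aggregator registers with its local NM; the NM reports the binding to the RM on its next heartbeat; the RM then serves the mapping to off-node clients) can be modeled minimally as below. This is a sketch only, under the assumption of hypothetical `SketchNodeManager`/`SketchResourceManager` classes; the real implementation would piggyback on the existing NodeStatusUpdater heartbeat.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Models the NM side: holds aggregator bindings registered by local
// app aggregators (e.g. ones hosted by the aux service).
class SketchNodeManager {
    private final Map<String, String> registeredAggregators = new ConcurrentHashMap<>();

    // Called by an app aggregator after it starts on this node.
    void registerAppAggregator(String appId, String address) {
        registeredAggregators.put(appId, address);
    }

    // Piggyback the known bindings on the next NM -> RM heartbeat.
    void heartbeat(SketchResourceManager rm) {
        rm.reportAggregators(registeredAggregators);
    }
}

// Models the RM side: the cluster-wide appId -> aggregator address registry
// that off-node clients (and other NMs) can query.
class SketchResourceManager {
    private final Map<String, String> aggregatorRegistry = new ConcurrentHashMap<>();

    void reportAggregators(Map<String, String> fromNode) {
        aggregatorRegistry.putAll(fromNode);
    }

    String lookupAggregator(String appId) {
        return aggregatorRegistry.get(appId);
    }
}
```

Note how this keeps the aux service decoupled from the RM, per the discussion above: the only new protocol is aggregator-to-NM, and the RM learns bindings solely through the heartbeat it already receives. Rebinding after an NM failure would follow the same path from the new host node.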
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338687#comment-14338687 ] Naganarasimha G R commented on YARN-3039: - Hi [~djp],
bq. Another idea (from Vinod in an offline discussion) is to add a blocking call in AMRMClient to get the aggregator address directly from the RM.
+1 for this approach. Also, if the NM uses this new blocking call in AMRMClient to get the aggregator address, there might not be any race condition for the NM posting the AM container's lifecycle events immediately after creation of the app aggregator through the aux service.
bq. In addition, if adding a new API in AMRMClient is acceptable, the NM will use TimelineClient too, so it can handle service discovery automatically.
Are we just adding a method to get the aggregator address, or are other APIs planned?
bq. The NM will notify the RM that this new app aggregator is ready for use in the next heartbeat to the RM (missing in this patch).
bq. The RM verifies that this app aggregator is out of service first, and kicks off rebinding the app aggregator to another NM's per-node aggregator service when the next heartbeat comes.
I believe the idea of using the aux service was to decouple the NM and the Timeline service. If the NM will notify the RM about new app aggregator creation (based on the aux service), then the NM must be aware that PerNodeAggregatorServer is configured as an aux service, and if it supports rebinding the app aggregator on failure, it must be able to communicate with this aux service too. Would that be a clean approach? I also feel we need to support starting a per-app aggregator only if the app requests it (Zhijie also mentioned this). If not, we can make use of one default aggregator for all such apps launched on the NM, which is just used to post container entities from the different NMs for these apps. Did any discussions happen w.r.t. the RM having its own aggregator?
I feel it would be better for the RM to have one, as it need not depend on any NMs to post entities.
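The "blocking call in AMRMClient" alternative endorsed above can be sketched as a bounded poll against the RM-side registry. Everything here is hypothetical (`SketchAggregatorRegistry`, `SketchBlockingClient`, `getAggregatorAddress`); it is not a real AMRMClient API, only a model of the semantics being proposed: block the caller until the address is known or a timeout expires.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stands in for the RM-side appId -> aggregator address registry.
class SketchAggregatorRegistry {
    private final Map<String, String> addresses = new ConcurrentHashMap<>();

    void bind(String appId, String address) {
        addresses.put(appId, address);
    }

    String query(String appId) {
        return addresses.get(appId);
    }
}

// Client-side blocking lookup: polls until the address appears or the
// deadline passes, so the caller never races ahead of aggregator creation.
class SketchBlockingClient {
    private final SketchAggregatorRegistry registry;

    SketchBlockingClient(SketchAggregatorRegistry registry) {
        this.registry = registry;
    }

    // Returns the aggregator address, or null if it is still unknown
    // after timeoutMs (the caller decides how to handle the timeout).
    String getAggregatorAddress(String appId, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        String address;
        while ((address = registry.query(appId)) == null) {
            if (System.currentTimeMillis() >= deadline) {
                return null;
            }
            Thread.sleep(10); // back off between polls
        }
        return address;
    }
}
```

This shows why the thread above treats the blocking call as an optimization rather than the base mechanism: it removes the wait for the next heartbeat interval, but applications that do not use AMRMClient (such as MR) would still need the heartbeat path.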