[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS writer service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336707#comment-14336707 ] Junping Du commented on YARN-3039: -- Thanks [~Naganarasimha] and [~rkanter] for the review and comments! bq. I feel the AM should be informed of AggregatorAddr as early as registration itself, rather than in ApplicationMasterService.allocate() as currently done. That's a good point. Another idea (from Vinod in an offline discussion) is to add a blocking call in AMRMClient that gets the aggregator address directly from the RM. AMRMClient could then be wrapped inside TimelineClient, so that aggregator address lookup and aggregator failure are handled transparently. Thoughts? bq. For NMs too, would it be better to update during registration itself (maybe recovered during recovery, not sure though)? Thoughts? I think the NM case is slightly different: an NM needs this knowledge only once the first container of an app gets allocated/launched on it, so updating it in the heartbeat sounds good enough, doesn't it? In addition, if adding a new API in AMRMClient is acceptable, the NM can use TimelineClient too and thus handle service discovery automatically. bq. I was not clear about the source of RMAppEventType.AGGREGATOR_UPDATE. Based on YARN-3030 (aggregator collection through the NM's aux service), PerNodeAggregatorServer (aux service) launches AppLevelAggregatorService, so will AppLevelAggregatorService inform the RM about the aggregator for the application, and then the RM informs the NM about the appAggregatorAddr as part of the heartbeat response? If this is the flow, will there be a chance of a race condition where, before the NM gets the appAggregatorAddr from the RM, the NM needs to post some AM container entities/events?
I think we can discuss this flow in two scenarios: the first-time launch of the app aggregator, and the app aggregator failing over to another NM. For the first-time launch, the NM aux service binds the app aggregator to the per-node aggregator when the AM container gets allocated (per YARN-3030). The NM then notifies the RM that this new app aggregator is ready for use in its next heartbeat (missing in this patch). After receiving this message from the NM, the RM updates its aggregator list and sends RMAppEventType.AGGREGATOR_UPDATE to trigger persisting the updated aggregator list in the RMStateStore (for RM failover). For app aggregator failover, the AM or NMs (whoever called putEntities with TimelineClient) notify the RM of the failure; the RM first verifies that this app aggregator is out of service, then kicks off rebinding the app aggregator to another NM's per-node aggregator service when that NM's next heartbeat comes. When it hears back from the new NM, the RM does the same thing as in the first case. One gap today is that we launch the app aggregator service (via the NM's auxiliary service) whenever an AM container gets launched, whether it is the first launch or a reschedule after a failure. As in my earlier comments above, an AM container failing over and being rescheduled to another NM should not necessarily cause a rebind of the aggregator service, just as an out-of-service app aggregator should not necessarily cause the AM container to be killed. So I think the app aggregator service should be launched by the NM automatically only for the first attempt and taken care of by the RM for subsequent attempts. About the race condition between the NM heartbeat and posting entities: I don't think posting entities should block any major logic, especially the NM heartbeat. In addition, if we make TimelineClient handle service discovery automatically, this will never happen. What do you think? bq. Sorry for not commenting earlier. Thanks for taking this up Junping Du. No worry. Thanks! bq. Not using YARN-913 is fine if it's not going to make sense.
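The transparent-discovery idea above (TimelineClient wrapping a blocking AMRMClient-style lookup, re-resolving on failure) could look roughly like the following. This is only an illustrative sketch: DiscoveringTimelineClient, AggregatorResolver and Transport are hypothetical names for this discussion, not existing YARN APIs.

```java
import java.io.IOException;

// Hypothetical sketch: a timeline client that resolves the per-app aggregator
// address through a blocking RM lookup (e.g. wrapped around AMRMClient), and
// re-resolves it transparently when a put fails because the aggregator has
// moved to another NM. All names here are illustrative, not real YARN APIs.
public class DiscoveringTimelineClient {

  /** Blocking lookup against the RM for the current aggregator address. */
  public interface AggregatorResolver {
    String resolveAggregatorAddress() throws IOException;
  }

  /** Raw transport that posts entities to a concrete aggregator address. */
  public interface Transport {
    void putEntities(String aggregatorAddress, String entities) throws IOException;
  }

  private final AggregatorResolver resolver;
  private final Transport transport;
  private volatile String cachedAddress; // refreshed lazily and on failure

  public DiscoveringTimelineClient(AggregatorResolver resolver, Transport transport) {
    this.resolver = resolver;
    this.transport = transport;
  }

  public void putEntities(String entities) throws IOException {
    if (cachedAddress == null) {
      cachedAddress = resolver.resolveAggregatorAddress();
    }
    try {
      transport.putEntities(cachedAddress, entities);
    } catch (IOException aggregatorMovedOrDown) {
      // The aggregator may have failed over to another NM:
      // re-resolve once through the RM and retry.
      cachedAddress = resolver.resolveAggregatorAddress();
      transport.putEntities(cachedAddress, entities);
    }
  }

  public String currentAddress() {
    return cachedAddress;
  }
}
```

With this shape, callers (AM or NM) never see the aggregator address at all, which is what makes the failover case in the second scenario above invisible to them.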
I haven't looked too closely at it either; it just sounded like it might be helpful here. Agree. My feeling now is that service discovery is tightly coupled with service lifecycle management. Our app aggregator service does not live inside a dedicated container but has several deployment options, and its consumers include YARN components, not only the AM. So I think YARN-913 may not be the best fit at this moment. [~ste...@apache.org] is the main author of YARN-913. Steve, do you have any comments here? bq. Given that a particular NM is only interested in the Applications that are running on it, is there some way to have it only receive the aggregator info for those apps? This would decrease the amount of throwaway data that gets sent. In the current patch, the RM only sends an NM the aggregator list for apps active on that node. Please check the code in ResourceTrackerService:
{code}
+ConcurrentMap<ApplicationId, String> liveAppAggregatorsMap =
+    new ConcurrentHashMap<ApplicationId, String>();
+List<ApplicationId> keepAliveApps = remoteNodeStatus.getKeepAliveApplications();
+if (keepAliveApps != null) {
+  ConcurrentMap<ApplicationId, RMApp> rmApps = rmContext.getRMApps();
+  for (ApplicationId appId : keepAliveApps) {
+
{code}
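The loop in the patch excerpt above is cut off; a self-contained sketch of the filtering it appears to perform might look like this. Note this is a guess at the intent, not the actual patch code: RMApp is reduced here to a stand-in interface, application ids are plain Strings for brevity, and getAggregatorAddr() is the accessor the patch introduces, used hypothetically.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical completion of the truncated snippet above: given the apps a
// node reports as alive, look each one up in the RM's app table and collect
// only the aggregator addresses that node actually needs.
public class AggregatorListFilter {

  /** Stand-in for the real RMApp interface; only the field we need here. */
  public interface RMApp {
    String getAggregatorAddr();
  }

  public static ConcurrentMap<String, String> aggregatorsForNode(
      List<String> keepAliveApps, Map<String, RMApp> rmApps) {
    ConcurrentMap<String, String> liveAppAggregatorsMap =
        new ConcurrentHashMap<String, String>();
    if (keepAliveApps == null) {
      return liveAppAggregatorsMap;
    }
    for (String appId : keepAliveApps) {
      RMApp app = rmApps.get(appId);
      // Skip finished apps and apps whose aggregator is not registered yet.
      if (app != null && app.getAggregatorAddr() != null) {
        liveAppAggregatorsMap.put(appId, app.getAggregatorAddr());
      }
    }
    return liveAppAggregatorsMap;
  }
}
```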
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336219#comment-14336219 ] Robert Kanter commented on YARN-3039: - Sorry for not commenting earlier. Thanks for taking this up [~djp]. Not using YARN-913 is fine if it's not going to make sense. I haven't looked too closely at it either; it just sounded like it might be helpful here. One comment on the patch: - Given that a particular NM is only interested in the Applications that are running on it, is there some way to have it only receive the aggregator info for those apps? This would decrease the amount of throwaway data that gets sent. Also, can you update the design doc? Looking at the patch, it seems like some things have changed (e.g. it's using protobufs instead of REST, which I think makes more sense here anyway). [Aggregator wireup] Implement ATS writer service discovery -- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, YARN-3039-no-test.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335010#comment-14335010 ] Junping Du commented on YARN-3039: -- Thanks [~zjshen] for the review and comments! bq. I think so, too. RM has its own built-in aggregator, and RM directly writes through it. I have a very basic question here: didn't we want a singleton app aggregator for all app-related events, logs, etc.? Ideally, only this singleton aggregator has the magic to sort out app info during aggregation. If not, we could even give up the current flow of NM(s) -> app aggregator (deployed on one NM) -> backend and let NMs talk to the backend directly, saving a hop of traffic. Can you clarify more on this? bq. in the heartbeat, instead of always sending a snapshot of the aggregator address info, can we send incremental information when a change happens to the aggregator address table? Usually, an aggregator will not change its place often, so we can avoid unnecessary additional traffic in most heartbeats. That's a very good point for discussion. The interesting thing here is that only by comparing against the info from the client (NM) can we know what has changed on the server (RM) since the last heartbeat. Take the token update as an example (populateKeys() in ResourceTrackerService): in our current implementation, we encode the master keys (ContainerTokenMasterKey and NMTokenMasterKey) known by the NM in the request, and in the response we filter out the old keys already known by the NM. IMO, this approach (put everything in the request, and put something/nothing in the response) is no better than putting nothing in the request and everything in the response; it only turns outbound traffic into inbound and moves the comparison logic to the server side. Isn't it? Another optimization we can consider is to let the client express its interested app aggregators in the request (by adding them to a new optional field, e.g.
InterestedApps) when it finds that this info is missing or stale, and have the server loop in only the related app aggregators' info. The NM can maintain an interested-app list, which gets updated when an app's first container is launched or when the app's aggregator info goes stale (as may be reported by the writer/reader's retry logic), and from which items are removed once they are received in a heartbeat response. Thoughts? bq. One additional issue related to the RM state store: calling it in the update transition may break app recovery. The current state, instead of the final state, will be written into the store. If the RM stops and restarts at this moment, this app can't be recovered properly. Thanks for the reminder on this. This is something I am not 100% sure about. However, from recoverApplication() in RMAppManager, I didn't see that we cannot recover an app in RUNNING or other states (except final states, like killed, finished, etc.). Do I miss anything on this? One piece of code indeed missing here is that I forgot to repopulate aggregatorAddr from the store in RMAppImpl.recover(); I will add it back in the next patch.
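The InterestedApps idea above can be sketched end to end as follows. This is an illustrative model only, not YARN code: the NM asks only for aggregator addresses it is missing or believes stale, the RM answers only for those apps, and resolved apps drop out of the set, so steady-state heartbeats carry no aggregator payload at all. All class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the "InterestedApps" heartbeat optimization:
// request carries only the apps the node still needs addresses for,
// response carries only the addresses the RM currently knows.
public class InterestedAppsProtocol {

  /** NM side: tracks which apps still need an aggregator address. */
  public static class NodeSide {
    private final Set<String> interested = new HashSet<>();
    private final Map<String, String> known = new HashMap<>();

    /** Called when an app's first container launches, or its info goes stale. */
    public void markInterested(String appId) {
      interested.add(appId);
    }

    /** Snapshot that goes into the heartbeat request. */
    public Set<String> interestedApps() {
      return new HashSet<>(interested);
    }

    /** Applies the heartbeat response; resolved apps leave the set. */
    public void onHeartbeatResponse(Map<String, String> aggregators) {
      known.putAll(aggregators);
      interested.removeAll(aggregators.keySet());
    }

    public String aggregatorFor(String appId) {
      return known.get(appId);
    }
  }

  /** RM side: answers only for the apps the node asked about. */
  public static Map<String, String> respond(
      Set<String> interestedApps, Map<String, String> rmAggregatorTable) {
    Map<String, String> reply = new HashMap<>();
    for (String appId : interestedApps) {
      String addr = rmAggregatorTable.get(appId);
      if (addr != null) {
        reply.put(appId, addr);
      }
    }
    return reply;
  }
}
```

An app whose aggregator is not yet registered simply stays in the interested set and is retried on the next heartbeat, which matches the "updated when stale, removed when received" behavior described above.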
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335118#comment-14335118 ] Naganarasimha G R commented on YARN-3039: - Hi [~djp], thanks for the doc, which gives a better understanding of the flow now. A few queries: * I feel the AM should be informed of AggregatorAddr as early as registration itself, rather than in ApplicationMasterService.allocate() as currently done. * For NMs too, would it be better to update during registration itself (maybe recovered during recovery, not sure though)? Thoughts? * I was not clear about the source of RMAppEventType.AGGREGATOR_UPDATE. Based on YARN-3030 (aggregator collection through the NM's aux service), PerNodeAggregatorServer (aux service) launches AppLevelAggregatorService, so will AppLevelAggregatorService inform the RM about the aggregator for the application, and then the RM informs the NM about the appAggregatorAddr as part of the heartbeat response? If this is the flow, will there be a chance of a race condition where, before the NM gets the appAggregatorAddr from the RM, the NM needs to post some AM container entities/events? [~zjshen], * bq. Ideally, only this singleton aggregator has the magic to sort out app info during aggregation. If not, we could even give up the current flow of NM(s) -> app aggregator (deployed on one NM) -> backend and let NMs talk to the backend directly, saving a hop of traffic. Can you clarify more on this? I also want some clarification along similar lines: what is the goal of having one aggregator per app? Is it for simple aggregation of metrics related to an application entity, or for any entity (flow, flow run, app-specific, etc.)? If so, do we need to aggregate for system entities? Maybe based on this it will be clearer to get the complete picture. * In one of your comments (not in this JIRA), you had mentioned that we might need to start a per-app aggregator only if the app requests it.
In that case, how will we capture container entities and their events if the app does not request a per-app aggregator?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1409#comment-1409 ] Junping Du commented on YARN-3039: -- Thanks [~sjlee0] for the comments! bq. I'm also thinking that option 2 might be more feasible, mostly from the standpoint of limiting the risk. Having said that, I haven't followed YARN-913 closely enough to see how close it is... I was thinking the same. As discussed with [~vinodkv] offline, we prefer to start the work immediately based on currently implemented YARN features. [~rkanter], please let us know if you have different ideas here. bq. The service discovery needs to work across all these different modes: NM aux service, standalone per-node daemon, and standalone per-app daemon. That needs to be one of the primary considerations in this. Agree. What doesn't change here is that there are still three counterparts - AM, NM and RM - that need to know the service info (URL for the REST API), so we put the RM here as a central point for registration. What could differ across the modes you mention is who does the registration and how. I would prefer that some other JIRA, like YARN-3033, address those differences. Thoughts? bq. The RM will likely not use the service discovery. For example, for the RM to write the app-started event, the timeline aggregator may not even be initialized yet. That's a very good point. We need the RM to write some initial app info on its own. However, do we expect the RM to write all app-specific info, or just at the beginning? We have a similar case in launching an app's containers - the first AM container gets launched by the RM, but the following containers get launched by the AM. Do we want to follow this pattern if we want to consolidate all app info into only one app aggregator? bq. If the AM fails and starts on another node, the existing per-app aggregator should be shut down and started on the new node. In fact, in the aux service setup, that comes most naturally.
So I think we should try to keep that as much as possible. As I said in the proposal, we should make a best effort to locate the two together. However, I think we also want to decouple the lifecycles of the two, which would make things more robust. Besides the case of the aggregator staying live while the AM dies, another quick example: the AM container works fine, but the aggregator on that NM cannot be bound/started (for some reason, e.g. the port is blocked). In such cases, we may not want to kill the AM container (or the aggregator service) just for aggregation-locality reasons; given these are rare cases, keeping it simple should be better. bq. We're talking about the aggregator failing as a standalone daemon, correct? Yes and no. Even as an auxiliary service of the NM, the aggregator could fail alone for some reasons, e.g. the port is blocked. Am I missing anything here?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333709#comment-14333709 ] Hadoop QA commented on YARN-3039: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700214/YARN-3039-no-test.patch against trunk revision fe7a302.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 8 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}.
The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs
org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.TestRMContainerImpl
org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerDynamicBehavior
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils
org.apache.hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate
org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage
org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestRMNMRPCResponseId
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore
org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6700//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6700//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6700//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6700//console
This message is automatically generated.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334159#comment-14334159 ] Zhijie Shen commented on YARN-3039: --- bq. The RM will likely not use the service discovery. For example, for the RM to write the app-started event, the timeline aggregator may not even be initialized yet. I think so, too. RM has its own built-in aggregator, and RM directly writes through it. Thanks for the patch, Junping! One suggestion: in the heartbeat, instead of always sending a snapshot of the aggregator address info, can we send incremental information when a change happens to the aggregator address table? Usually, an aggregator will not change its place often, so we can avoid unnecessary additional traffic in most heartbeats.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334206#comment-14334206 ] Zhijie Shen commented on YARN-3039: --- One additional issue related to the RM state store: calling it in the update transition may break app recovery. The current state, instead of the final state, will be written into the store. If the RM stops and restarts at this moment, this app can't be recovered properly.
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329737#comment-14329737 ] Sangjin Lee commented on YARN-3039: --- Thanks [~djp] for the doc! Some high level comments: - I'm also thinking that option 2 might be more feasible, mostly from the standpoint of limiting the risk. Having said that, I haven't followed YARN-913 closely enough to see how close it is... - The service discovery needs to work across all these different modes: NM aux service, standalone per-node daemon, and standalone per-app daemon. That needs to be one of the primary considerations in this. - The failure scenarios need more details in their own right; for this JIRA, I think it is sufficient to see how they may impact the service discovery and design just enough. {quote} We need a per-application logical aggregator for ATS which provides aggregator service in the form of a REST API to RM, AM and NMs. {quote} The RM will likely not use the service discovery. For example, for the RM to write the app-started event, the timeline aggregator may not even be initialized yet. {quote} However, the AM container could be rescheduled to another node for some reason (container failure, etc.), so we cannot guarantee the two are always together. {quote} If the AM fails and starts on another node, the existing per-app aggregator should be shut down and started on the new node. In fact, in the aux service setup, that comes most naturally. So I think we should try to keep that as much as possible. {quote} Failure Cases: 3. Aggregator failed (only): {quote} We're talking about the aggregator failing as a standalone daemon, correct?
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329084#comment-14329084 ] Junping Du commented on YARN-3039: -- Hi [~rkanter], thanks for sharing your thoughts here. I think, as a generic external service for YARN, YARN-913 may not meet our particular requirements here, such as: - the timeline service will serve as a built-in service, so applications should not need to register the service explicitly - the NM also needs this aggregator info to aggregate info related to the containers running on top of it - we have a preference to bind the service to the local node of the AM container - currently, the launching of NM aggregators is not done via a YARN service container (see YARN-3033) Also, I think we may not want this built-in service (as a standalone feature) to depend on another big in-progress feature when unnecessary. Thoughts?