[
https://issues.apache.org/jira/browse/YARN-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205252#comment-16205252
]
Allen Wittenauer edited comment on YARN-7127 at 10/15/17 7:10 PM:
--
I thought some more about this topic this morning and had two more things to
add:
1) I think an AM should have a way to tell the RM about any extra capabilities
it might have. This feature isn't particularly useful for the RM, but it would
be beneficial for any clients. For example, the MR AM might tag itself as
"jobtracker" to note that it supports the extra features that the 'mapred'
command uses. A Slider AM might tag itself as 'slider' or 'native' or whatever
to signify that it supports those extensions, and so on. That would make
extending the yarn application subcommand MUCH easier and potentially even open
the door for extensions/plug-ins to that command from third parties. For
example, turning the extra mapred subcommands into a hook off of yarn
application would allow us to ultimately kill the mapred command once the
timeline server is capable of doing everything that the history server can.
2) A large part of the discussion here is fueled by conflicting views on this
project's place within Hadoop. If one takes the belief that it's "just another
framework, like MapReduce," then creating separate sub-commands, documentation,
daemons, etc. seems logical. If one takes the view that it's "part of YARN,"
then adding new sub-commands, a separate documentation section, and a ton of
new daemons does not make sense.
But it doesn't appear that either of those choices has been made. Portions of
the code base fit the separate-framework mold, but other changes are to core
YARN functionality, even if we push aside "obviously part of YARN" bits like
RegistryDNS.
It seems as though the folks working on this branch need to make that decision
and drive it to completion: is it part of YARN or is it not? If it's the
former, then that means full integration: no more separate API daemon, no
different subcommand structure, etc., etc. If it's the latter, then that means
total separation: it needs to be a separate subproject, no shared code base,
new top-level command, etc., etc.
Having a foot in both is what is ultimately driving this disagreement and will
eventually confuse users.
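As a rough illustration of point 1, the sketch below shows how capability tags might flow from an AM through the RM to a client that decides which extra subcommands to offer. Nothing here is an existing YARN API: CapabilityRegistry, AppReport, SUBCOMMAND_HOOKS, available_subcommands, and the subcommand names are all hypothetical, and the only ideas taken from the comment are the tags themselves ("jobtracker", "slider") and the fact that the RM stores them without interpreting them.

```python
from dataclasses import dataclass, field


@dataclass
class AppReport:
    """Hypothetical stand-in for what the RM returns about an application."""
    app_id: str
    capabilities: frozenset = field(default_factory=frozenset)


class CapabilityRegistry:
    """Hypothetical RM-side store. It records tags but never interprets them."""

    def __init__(self):
        self._reports = {}

    def register(self, app_id, capabilities=()):
        # Would be called on behalf of an AM when it registers with the RM.
        self._reports[app_id] = AppReport(app_id, frozenset(capabilities))

    def report(self, app_id):
        return self._reports[app_id]


# Client side: extra subcommands keyed by the capability tag they require.
# The subcommand names are illustrative placeholders, not real commands.
SUBCOMMAND_HOOKS = {
    "jobtracker": ["counters", "history"],
    "slider": ["flex"],
}


def available_subcommands(report):
    """Return the extra subcommands a client could offer for this application."""
    commands = []
    for tag in sorted(report.capabilities):
        # Unknown tags are skipped, so third parties can add tags freely.
        commands.extend(SUBCOMMAND_HOOKS.get(tag, []))
    return commands
```

Under these assumptions, a client would fetch the report for an application and enable only the hooks whose tag is present, so an AM without any tags simply gets the stock subcommands.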
> Merge yarn-native-service branch into trunk
> ---
>
> Key: YARN-7127
> URL: https://issues.apache.org/jira/browse/YARN-7127
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-712