[
https://issues.apache.org/jira/browse/SLIDER-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123192#comment-14123192
]
thomas liu commented on SLIDER-158:
-----------------------------------
After discussing the requirements offline, here is a first version of them.
I am also sending a separate email to slider-dev@ to gather more
suggestions.
Synopsis:
slider diagnostics [required options] [additional params]
Description:
The command prints out diagnostic information about the slider application,
which can be used for debugging, installation verification, etc. It takes
one required option that selects which specific information about
the slider application the user cares about. For example, '--client' prints
out information about the slider client on the user's local host, including the
path and version of the JDK used to run the slider client, the path and version
of the slider client, path to the slider-client.xml...
It also comes with an option to intelligently check the configuration and
status of the slider application step by step, to see whether the
application can run properly, and to print out the check results to the user. If
the application identified by the application name passed in as a parameter has
not been started by slider yet, slider will check whether enough
resources can be allocated to the application; if the application is
already running, slider will check whether the Slider AM has started the right
number of instances per role for the application, and their running status.
Note that the command doesn't print out the status of the applications, as that
can be retrieved through the list or status options of the slider command,
though it will show the installation configuration of the application through
its --application option.
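To make the "one required option" rule concrete, here is a minimal Python sketch of how the option set in this proposal might be parsed. It is only an illustration: Slider's client is not written this way, the build_parser name is hypothetical, and treating the options as mutually exclusive is an assumption drawn from the wording above rather than something the proposal states outright.

```python
import argparse


def build_parser():
    """Build a parser for the proposed 'slider diagnostics' options (sketch)."""
    parser = argparse.ArgumentParser(prog="slider diagnostics")
    # Assumption: exactly one of the options must be given per invocation.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-c", "--client", action="store_true",
                       help="print information about the slider client")
    group.add_argument("-s", "--slider", action="store_true",
                       help="print information about the slider installation")
    group.add_argument("-a", "--application", metavar="APPLICATION_NAME",
                       help="print the installation configuration of an application")
    group.add_argument("-y", "--yarn", action="store_true",
                       help="print information about the YARN cluster")
    group.add_argument("-i", "--intelligent", metavar="APPLICATION_NAME",
                       help="run the step-by-step checks for an application")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, running the command with no option, or with two options at once, is rejected by the parser before any diagnostics run.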
Options:
-c
--client
This option prints out information about slider client:
the path and version of the slider command
the path to the slider-client.xml
the path and version of the JDK on which slider runs
-s
--slider
This option prints out information about slider installation on the cluster:
the path and permissions of the slider agent tarball on HDFS
the ‘run as’ user
the Python version and path
the JDK version and path
-a [APPLICATION_NAME]
--application [APPLICATION_NAME]
This option prints out information about the installation configuration of an
application; it requires a parameter to the CLI specifying the application name:
the location of the cluster instance directory in HDFS
the path and permissions of the application package tarball on HDFS
the location of the appConfig.json used to start the application
the location of the resource.json used to start the application
-y
--yarn
This option prints out the information about the YARN cluster:
The version and path of the JDK on which YARN runs
The Hadoop version and installation path of the YARN host
-i [APPLICATION_NAME]
--intelligent [APPLICATION_NAME]
This option intelligently checks various configuration/status of the slider
application step by step to see if the application is running properly and
prints out the check result to the user. If the application identified by the
application name passed in as a parameter has not been started by slider yet,
then slider will check whether there are enough resources that can be allocated
to the application according to the definition in resource.json; if the
application is already running, then slider will check whether the Slider AM has
started the right number of instances per role for the application, and their
running status. The order of checking is:
if the slider client can find and process slider-client.xml, appConfig.json,
resource.json
if the slider client can talk to the slider application master properly
+if the application is running:
+if the slider agents have started the right number of the application roles
properly according to the definition in resource.json
+if not:
check if JDK and Python are installed on the hosts where the slider AM
attempted to install the application role, and if JAVA_HOME is configured
properly
check if slider agent tarball is installed on HDFS according to the
--image option of slider create command
check if application tarball is available on HDFS according to the
appConfig.json
check if the runas user has the permission on HDFS to fetch those
tarballs
print out the error from slider.err (TBD)
+if the application has not started:
if there are enough resources that can be allocated to the application
according to the definition in resource.json, by talking to the resource manager.
Note: a race condition may occur here, as this command can only show whether
there are enough resources at the moment it is run, which may change by the time
the create cluster command is actually run
if the slider application master can find the tarball of the application on
HDFS
check if the runas user has the permission on HDFS to fetch those tarballs
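
The checking order above can be read as a small decision tree. The Python sketch below only plans the order of the steps; it performs no real checks. The plan_checks function, its parameters and the step descriptions are hypothetical names introduced for illustration, and the two common leading steps simply mirror the order in which the proposal lists them.

```python
from typing import List


def plan_checks(app_running: bool, roles_ok: bool = True) -> List[str]:
    """Return the ordered diagnostic steps for one application (sketch)."""
    # Steps listed first in the proposal, before the running/not-running split.
    steps = [
        "load slider-client.xml, appConfig.json, resource.json",
        "contact the slider application master",
    ]
    if app_running:
        steps.append("compare running role instances against resource.json")
        if not roles_ok:
            # Follow-up checks when the role counts do not match.
            steps += [
                "check JDK, Python and JAVA_HOME on the role hosts",
                "check slider agent tarball on HDFS (--image)",
                "check application tarball on HDFS (appConfig.json)",
                "check run-as user's HDFS permission on the tarballs",
                "print errors from slider.err",
            ]
    else:
        # Pre-flight checks; the resource answer is only a snapshot in time.
        steps += [
            "ask the resource manager whether resource.json can be satisfied",
            "check that the application tarball is on HDFS",
            "check run-as user's HDFS permission on the tarballs",
        ]
    return steps


if __name__ == "__main__":
    for step in plan_checks(app_running=False):
        print(step)
```

Keeping the plan separate from the checks themselves would let the command print which steps it intends to run before it touches the cluster.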
> add a "slider diagnostics" command
> ----------------------------------
>
> Key: SLIDER-158
> URL: https://issues.apache.org/jira/browse/SLIDER-158
> Project: Slider
> Issue Type: New Feature
> Components: client
> Affects Versions: Slider 0.30
> Reporter: Steve Loughran
> Assignee: thomas liu
>
> Tech preview users are having problems, and it is very hard for us to
> diagnose what is going on.
> I propose a {{slider diagnostics}} command to print out the client-side
> state, look at HDFS and see what's up there
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)