Gour Saha created SLIDER-1216:
---------------------------------
Summary: [Phase 2] Increase Slider applications (live/dead)
debuggability by providing container (live and dead) diagnostics from cmd-line
and YARN status
Key: SLIDER-1216
URL: https://issues.apache.org/jira/browse/SLIDER-1216
Project: Slider
Issue Type: Bug
Components: appmaster, client
Affects Versions: Slider 0.91
Reporter: Gour Saha
Assignee: Gour Saha
Fix For: Slider 1.0.0
Today, the options for debugging a failing Slider application are painful. One
option is to traverse several links in the RM UI, starting from the application
link and going all the way down to the container logs. An app owner might have
access to a gateway, but the logs still might not be available until the app
dies if rolling log aggregation is not enabled on the cluster.
Slider provides the capability to create apps with friendly names and hence,
to a certain extent, hides the YARN application id. It is not difficult to find
the YARN application id, but app owners are more used to referring to their
apps by their well-known names. All interactions from the command line using
the Slider client require only the app name.
It would be great to provide container diagnostics (live and dead), such as
absolute links to container logs in the RM UI (links for live and dead
containers will be different), additional YARN-level diagnostics (specifically
for failed/killed containers), etc. With the absolute log links, an app owner
can jump directly to the container logs without having to hunt through the RM
UI. All of this information should be made available from the Slider client so
that app owners can query it directly from the cmd-line using app names.
Consumers of the Slider client as an SDK will be able to call the appropriate
APIs and get this diagnostic information. For example, Ambari Slider Views
could then show these diagnostics directly in the Ambari UI, relieving app
owners of the pain of traversing the RM UI. Eventually, when the app
dies/completes, these container diagnostics should be published to YARN status,
so that debugging a failed application becomes easy as well.
At a high level, I am thinking of a cmd-line like:
slider diagnostics --name <app-name> --containers
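To make the proposal concrete, here is a rough sketch of the per-container
record such a command might print. It is illustrative only: the class
ContainerDiagnostics, the function format_report, the field names and the
example.invalid URLs are all hypothetical and do not exist in Slider or YARN.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerDiagnostics:
    """Hypothetical per-container record; not an existing Slider type."""
    container_id: str
    state: str                               # "live" or "dead"
    log_url: str                             # absolute link; differs for live vs dead
    yarn_diagnostics: Optional[str] = None   # YARN-level text for failed/killed containers


def format_report(app_name: str, containers: List[ContainerDiagnostics]) -> str:
    """Render a plain-text report keyed by the friendly app name."""
    lines = [f"Application: {app_name}"]
    for c in containers:
        lines.append(f"  {c.container_id} [{c.state}] logs: {c.log_url}")
        # Only dead containers carry extra YARN-level diagnostics in this sketch.
        if c.state == "dead" and c.yarn_diagnostics:
            lines.append(f"    diagnostics: {c.yarn_diagnostics}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder ids and URLs, assumed purely for illustration.
    print(format_report("my-app", [
        ContainerDiagnostics("container_a", "live",
                             "http://example.invalid/live/container_a"),
        ContainerDiagnostics("container_b", "dead",
                             "http://example.invalid/dead/container_b",
                             "exit code 143"),
    ]))
```

The same record could back both the cmd-line output and an SDK call, so that
consumers such as Ambari Slider Views read the fields directly instead of
parsing text.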