[
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696354#comment-14696354
]
Sergey Shelukhin commented on HIVE-11526:
-----------------------------------------
We can start by building a "realtime" view of LLAP state. Because there's no
central AM for an LLAP cluster, it's best to build this as a separate service.
See the LlapDaemon class for an example of how to create a daemon.
LlapServiceDriver builds the spec that Slider uses to run the LLAP cluster, so
this service should be added to that spec. It probably doesn't need much memory
or other resources.
LLAP exposes a JMX view over HTTP, on port 15002 by default (the setting is in
the LlapConfiguration file); I think this is covered in the doc I sent. So for
starters, the new service could just locate the LLAP daemons (for example, see
how the YARN registry is used for this in the Tez AM: LlapTaskSchedulerService
has registry and activeInstances fields that it uses to keep track of them),
then periodically scrape the JMX view (it's JSON, iirc) and produce a view from
it. You can check which useful counters and data the JMX view already has (such
as CPU/memory/etc. info, cache metrics, executor state).
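As a rough illustration of the scraping step, here is a minimal Python sketch.
It is not Hive code. It assumes the daemon serves the usual Hadoop-style JSON
document (a top-level "beans" list whose entries each carry a "name") at a
/jmx path on the port mentioned above; the path, the function names, and any
bean or metric names you filter on are assumptions for illustration only.

```python
import json
import urllib.request


def fetch_jmx(host, port=15002, timeout=5.0):
    """Fetch the raw JMX JSON text from one daemon (the /jmx path is assumed)."""
    url = "http://%s:%d/jmx" % (host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_jmx_beans(payload):
    """Return {bean name: attributes} from a Hadoop-style JMX JSON document."""
    doc = json.loads(payload)
    beans = {}
    for bean in doc.get("beans", []):
        name = bean.get("name")
        if name is None:
            continue  # skip malformed entries that carry no bean name
        beans[name] = {k: v for k, v in bean.items() if k != "name"}
    return beans


def numeric_metrics(beans, name_filter):
    """Collect numeric attributes of beans whose name contains name_filter."""
    out = {}
    for name, attrs in beans.items():
        if name_filter not in name:
            continue
        for key, value in attrs.items():
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                out[key] = value
    return out
```

A scraper loop would call fetch_jmx for each daemon found in the registry on a
timer and keep the latest numeric_metrics result per host.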
It should then expose a view of its own.
For now, it can just produce a JMX text view similar to what the LLAP daemon
produces, but with aggregated and summarized information. You can see how the
YARN web app is created for that in the LlapWebServices file and related files.
To get JMX output, see the interfaces marked with the MXBean annotation, such
as LlapDaemonMXBean, and the calls to MBeans.register. Whatever is registered
will be output into the JMX view by the web app.
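To make "aggregated and summarized" concrete, here is a small Python sketch of
one possible roll-up. It is only an illustration under assumptions: the input
is a per-daemon map of numeric metrics that has already been scraped, and the
summary bean name and the output layout are hypothetical. The actual service
would register an MXBean in Java, as described above.

```python
import json


def summarize(per_daemon):
    """Roll up {host: {metric: number}} into per-metric cluster statistics."""
    summary = {}
    for metrics in per_daemon.values():
        for key, value in metrics.items():
            stats = summary.setdefault(
                key, {"sum": 0, "min": value, "max": value, "daemons": 0})
            stats["sum"] += value
            stats["min"] = min(stats["min"], value)
            stats["max"] = max(stats["max"], value)
            stats["daemons"] += 1  # number of daemons reporting this metric
    for stats in summary.values():
        stats["avg"] = stats["sum"] / stats["daemons"]
    return summary


def render_view(summary):
    """Render the summary as JSON shaped like a single-bean JMX document."""
    bean = {
        # Hypothetical bean name, chosen only for this example.
        "name": "Example:service=LlapMonitor,name=ClusterSummary",
        "metrics": summary,
    }
    return json.dumps({"beans": [bean]}, indent=2, sort_keys=True)
```

Keeping the output in the same "beans" shape as the daemons' own view means a
consumer of the per-daemon JSON could read the cluster summary the same way.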
This is probably better broken down into multiple subtasks (create the daemon,
run it, scrape data, output data, maybe others); feel free to open them.
This would be a good start.
Then, once we have this standing, we can think about a UI, some more advanced
views, what info to add to JMX, etc.
> LLAP: implement LLAP UI as a separate service
> ---------------------------------------------
>
> Key: HIVE-11526
> URL: https://issues.apache.org/jira/browse/HIVE-11526
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Kai Sasaki
>
> The specifics are vague at this point.
> Hadoop metrics can be output, as well as metrics we collect and output in
> jmx, as well as those we collect per fragment and log right now.
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for
> part of the work.