[ https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696354#comment-14696354 ]
Sergey Shelukhin commented on HIVE-11526:
-----------------------------------------

We can start with building a "realtime" view of LLAP state. Because there's no central AM for the LLAP cluster, it's good to build a separate service. You can see the LlapDaemon class for an example of how to create a daemon. LlapServiceDriver builds the spec for Slider to run the LLAP cluster, so this service should be added to that. It probably doesn't need a lot of memory or other resources.

LLAP exposes a JMX view via HTTP, on port 15002 by default (it's in the LlapConfiguration file); I think it should be in the doc I've sent. So for starters this new service could just locate the LLAP daemons (see, for example, how the YARN registry is used to do this in the Tez AM: LlapTaskSchedulerService has registry and activeInstances fields that it uses to keep track of them), then periodically scrape the JMX view (it's JSON iirc) and produce some view. You can see what useful counters and data this view already has (such as CPU/memory/etc. info, cache metrics, executor state).

It should then expose a view of its own. For now, it can just produce a JMX text view similar to what the LLAP daemon produces, but with aggregated and summarized information. You can see how the YARN web app is created for that in the LlapWebServices file and related files; to get JMX output, see the interfaces marked with the MXBean annotation, like LlapDaemonMXBean, and the calls to MBeans.register. Whatever is registered will be output into the JMX view by the web app.

This is probably better broken down into multiple subtasks (create daemon, run it, scrape data, output data, maybe others); feel free to open them. This would be a good start. Once we have this standing, we can think about having a UI, some more advanced views, what info to add to JMX, etc.
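The scrape-and-aggregate step described above could be prototyped roughly as follows. This is only a sketch under stated assumptions, written in Python for brevity rather than as part of Hive: the `/jmx` path, the `{"beans": [...]}` response shape, and every attribute name are assumptions and not confirmed LLAP specifics, and the helper names (`fetch_beans`, `summarize`, `scrape_once`) are hypothetical. Locating the daemons through the registry is left out; the host list is simply passed in.

```python
import json
from urllib.request import urlopen

# Default JMX-over-HTTP port mentioned in the comment; configurable in LLAP.
JMX_PORT = 15002


def fetch_beans(host, port=JMX_PORT, timeout=5.0):
    """Fetch one daemon's JMX view as a list of bean dicts.

    Assumption: the view is served at /jmx as {"beans": [...]}.
    """
    with urlopen(f"http://{host}:{port}/jmx", timeout=timeout) as resp:
        return json.load(resp).get("beans", [])


def summarize(beans_by_host, attributes):
    """Sum the selected numeric attributes over all beans of all daemons."""
    totals = {attr: 0 for attr in attributes}
    for beans in beans_by_host.values():
        for bean in beans:
            for attr in attributes:
                value = bean.get(attr)
                # Skip missing or non-numeric values (bool is excluded on purpose).
                if isinstance(value, (int, float)) and not isinstance(value, bool):
                    totals[attr] += value
    return {"daemons": len(beans_by_host), "totals": totals}


def scrape_once(hosts, attributes, fetch=fetch_beans):
    """Scrape every host once; hosts that fail are reported, not fatal."""
    beans_by_host = {}
    unreachable = []
    for host in hosts:
        try:
            beans_by_host[host] = fetch(host)
        except (OSError, ValueError):
            # OSError covers most network failures, ValueError covers bad JSON.
            unreachable.append(host)
    summary = summarize(beans_by_host, attributes)
    summary["unreachable"] = unreachable
    return summary
```

A periodic scraper would call `scrape_once` in a loop with a sleep between rounds and publish the returned summary. The `fetch` parameter exists so that the aggregation can be exercised without a running cluster.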
> LLAP: implement LLAP UI as a separate service
> ---------------------------------------------
>
>                 Key: HIVE-11526
>                 URL: https://issues.apache.org/jira/browse/HIVE-11526
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Kai Sasaki
>
> The specifics are vague at this point.
> Hadoop metrics can be output, as well as metrics we collect and output in
> jmx, as well as those we collect per fragment and log right now.
> This service can do LLAP-specific views, and per-query aggregation.
> [~gopalv] may have some information on how to reuse existing solutions for
> part of the work.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)