[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488142#comment-16488142
 ] 

Todd Lipcon commented on HIVE-19685:
------------------------------------

[~prasanth_j] sorry, didn't see your question as it came while I was writing 
the above comment. I think I partially answered it, but I'll give slightly more 
color:

- the default configuration would be a "no-op" tracer, which should have no 
measurable overhead.
- if you drop one of the tracer implementations onto the classpath, it's up to 
you to configure it and provide your own trace collection infrastructure. In 
the case of Jaeger, for example, I've been using a configuration like:

{code}
export JAEGER_SERVICE_NAME=hms
export JAEGER_AGENT_HOST=my-machine.example.com
export JAEGER_AGENT_PORT=6831
export JAEGER_REPORTER_FLUSH_INTERVAL=1000
export JAEGER_SAMPLER_TYPE=const
export JAEGER_SAMPLER_PARAM=1
{code}

And on my-machine.example.com I run some Docker images provided by the Jaeger 
community. The simplest image they provide uses an in-memory store, but 
Jaeger can also write to Cassandra or Elasticsearch as backends. It also provides 
the UI as seen in my screenshot.
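
For anyone who wants to reproduce the collector side, here is a minimal sketch. It assumes the community `jaegertracing/all-in-one` image (the in-memory variant) and its default ports; the container name, the `latest` tag, and the `start_jaeger` helper are illustrative choices, not something taken from my actual setup.

```shell
# Assumed image: bundles agent, collector, and UI with an in-memory store.
JAEGER_IMAGE=jaegertracing/all-in-one:latest
# UDP port the agent listens on; matches JAEGER_AGENT_PORT in the client config.
JAEGER_AGENT_PORT=6831
# Port on which the web UI is served.
JAEGER_UI_PORT=16686

# Hypothetical helper: starts the container in the background.
start_jaeger() {
  docker run -d --name jaeger \
    -p "${JAEGER_AGENT_PORT}:6831/udp" \
    -p "${JAEGER_UI_PORT}:16686" \
    "${JAEGER_IMAGE}"
}
```

Running `start_jaeger` on my-machine.example.com should then expose the UI at http://my-machine.example.com:16686, with traces lost whenever the container restarts since nothing is persisted.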

Personally I've found this very useful to understand HMS performance issues 
during development, but I'm not sure if many end-users who deploy Hive would 
bother to set it up. IMO that's OK -- we can treat it as a dev-only feature 
without adding much maintenance burden.

> OpenTracing support for HMS
> ---------------------------
>
>                 Key: HIVE-19685
>                 URL: https://issues.apache.org/jira/browse/HIVE-19685
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Todd Lipcon
>            Priority: Major
>         Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation, including time spent in dependent systems (e.g. 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, a vendor-neutral tracing API, into the HMS server and 
> client.



