[ https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488139#comment-16488139 ]
Todd Lipcon commented on HIVE-19685:
------------------------------------

Before working on a patch against trunk, I wanted to run a summary of the design by folks:

- Add POM dependencies from the metastore module to opentracing.
-- The "opentracing" project itself is just APIs, not coupled to any tracing implementation. You can think of it like SLF4J, where the user has to provide an implementation of their choice. OpenTracing supports a number of implementations, the most popular being Jaeger (from Uber) and Zipkin (from Twitter), as well as a commercial implementation provided by LightStep.
-- We'd also include the 'tracerresolver' module. This uses a Java ServiceLoader to look for appropriate plugins on the classpath at start time. This would allow a user to drop Jaeger or Zipkin onto the classpath and enable tracing without recompilation. The tracing implementation's configuration is implementation-specific. For example, Jaeger is configured through environment variables.
- Add a POM dependency on opentracing-thrift, which is some simple utility code to wrap a TProtocol and TProcessor so that the client and server propagate a trace context between them. This allows a trace to be correlated between two processes (e.g. HS2 and HMS). We might want to shade these classes, since they would otherwise show up on the classpath of any consumer using the HMS client.

In order to get the tracing of JDBC calls as shown in the screenshot above, no code is necessary. The user just adds the opentracing-jdbc jar to their classpath and then appropriately configures their JDBC connection string. It acts like a "passthrough" to the underlying JDBC driver.

The above is the basic integration. Beyond that, we can add small bits of instrumentation at interesting points in the code. For example:

{code}
private boolean ensureDbInit() {
  // startActive(true) finishes the span when the try-with-resources scope closes
  try (Scope s = GlobalTracer.get().buildSpan("MetaStoreDirectSQL.ensureDbInit")
      .startActive(true)) {
    .... guts of method ...
  }
}
{code}

This makes it easy to spot issues like HIVE-19310.

Thoughts?

> OpenTracing support for HMS
> ---------------------------
>
>                 Key: HIVE-19685
>                 URL: https://issues.apache.org/jira/browse/HIVE-19685
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Todd Lipcon
>            Priority: Major
>         Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious
> why something took a long time. Using a tracing framework can provide an
> end-to-end view of an operation, including time spent in dependent systems
> (e.g. filesystem operations, RDBMS queries, etc). This JIRA proposes to
> integrate OpenTracing, which is a vendor-neutral tracing API, into the HMS
> server and client.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
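
As a rough illustration of the classpath-based discovery described in the 'tracerresolver' item of the comment above, the sketch below uses only the JDK's java.util.ServiceLoader. The names SpanTracer, NoopSpanTracer and TracerLookup are hypothetical stand-ins invented for this example. They are not the OpenTracing or tracerresolver API, and the real module may behave differently in detail.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

/** Hypothetical lookup helper; an assumption for illustration, not the tracerresolver API. */
final class TracerLookup {
    private TracerLookup() {}

    /**
     * Returns the first SpanTracer provider that ServiceLoader finds on the
     * classpath (registered via a META-INF/services/SpanTracer file), or a
     * no-op tracer when no provider is registered.
     */
    static SpanTracer resolve() {
        Iterator<SpanTracer> providers = ServiceLoader.load(SpanTracer.class).iterator();
        return providers.hasNext() ? providers.next() : new NoopSpanTracer();
    }

    public static void main(String[] args) {
        System.out.println("tracer in use: " + resolve().name());
    }
}

/** Hypothetical stand-in for a tracing API; not io.opentracing.Tracer. */
interface SpanTracer {
    String name();
}

/** Fallback used when nothing has been dropped onto the classpath. */
final class NoopSpanTracer implements SpanTracer {
    @Override
    public String name() {
        return "noop";
    }
}
```

With no provider jar present, resolve() falls back to the no-op tracer, which mirrors the idea that tracing stays disabled until a user adds an implementation to the classpath, with no recompilation needed.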