[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807619#comment-16807619 ]

Elek, Marton commented on HADOOP-15566:
---

Thanks for the questions, [~bogdandrutu].

q. What implementation will be shipped with the official HBase binary?

I don't know; IMHO that depends on HBase. With OT we can use multiple implementations, so HBase can provide any of them.

q. How can somebody use a different implementation?

It should be configurable. AFAIK the only vendor-specific part is the initialization code. It's easy to create an interface to initialize different implementations (e.g. a class name which is called to initialize the implementation).

q. How do you ensure that a different implementation (that is not tested with your entire test suite) may not corrupt user data? I think it is very important that all the tests are running with the implementation that user uses in production.

I don't think that we need to test all the implementations. We should prove that the OT API is used correctly and use one implementation as an example. And we should clearly document what is tested and what is not. Untested implementations provided by other vendors can be used, but users should test them first.

q. use one implementation (pick something that you like the most) and export these data to a configurable endpoint

Interesting. Can you please give me more details? What is the configurable endpoint? How would the tracing information be stored?

> Remove HTrace support
> ---------------------
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 3.1.0
> Reporter: Todd Lipcon
> Priority: Major
> Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, ss-trace-s3a.png
>
> The HTrace incubator project has voted to retire itself and won't be making
> further releases. The Hadoop project currently has various hooks with HTrace.
> It seems in some cases (eg HDFS-13702) these hooks have had measurable
> performance overhead. Given these two factors, I think we should consider
> removing the HTrace integration. If there is someone willing to do the work,
> replacing it with OpenTracing might be a better choice since there is an
> active community.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
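The "class name which is called to initialize the implementation" idea from the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TracerInitializer`, `NoopTracerInitializer` and `TracerLoader` are hypothetical names, not existing Hadoop or OpenTracing classes, and the class name is assumed to come from some configuration key that is not shown here.

```java
// Hypothetical vendor-neutral hook: each tracing backend supplies one initializer.
interface TracerInitializer {
    /** Performs the vendor-specific setup and returns a short label for the backend. */
    String initialize();
}

// Hypothetical default used when no implementation is configured.
class NoopTracerInitializer implements TracerInitializer {
    @Override
    public String initialize() {
        return "noop";
    }
}

final class TracerLoader {
    private TracerLoader() {}

    /**
     * Loads the initializer named by className (assumed to be read from
     * configuration). Falls back to the no-op initializer when unset.
     */
    static TracerInitializer load(String className) {
        if (className == null || className.isEmpty()) {
            return new NoopTracerInitializer();
        }
        try {
            Class<?> clazz = Class.forName(className);
            // asSubclass rejects classes that do not implement the hook.
            return clazz.asSubclass(TracerInitializer.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot load tracer initializer: " + className, e);
        }
    }
}
```

With something like this, only the configured class name changes between vendors; the calling code stays the same, and a misconfigured name fails fast at startup instead of silently disabling tracing.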
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757657#comment-16757657 ]

Bogdan Drutu commented on HADOOP-15566:
---

[~elek] - I have some questions about OT integration:

1) What implementation will be shipped with the official HBase binary?
2) How can somebody use a different implementation?
3) How do you ensure that a different implementation (one that is not tested with your entire test suite) does not corrupt user data? I think it is very important that all the tests run with the implementation that the user runs in production.

FYI: I know I am biased, but I think a different approach is better here: use one implementation (pick whichever you like the most) and export the data to a configurable endpoint. Then every vendor can consume that format.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752975#comment-16752975 ]

Elek, Marton commented on HADOOP-15566:
---

Thanks for the idea, [~cmccabe]. It's interesting.

Correct me if I am wrong, but as I see it HTrace is not designed to be extensible. For example, Span is an interface but Tracer always creates the MilliSpan implementation.

To use HTrace as a lightweight layer and support multiple tracing implementations (such as opentracing or opencensus) we would need to refactor the HTrace code. I have two problems with this approach:

1) The refactored HTrace won't be compatible with the old HTrace. It would be hard to support the old HTrace.
2) It would be equivalent to resurrecting HTrace, which has voted to retire. (The same thing could be done without importing the HTrace code into Hadoop, by refactoring it on the HTrace side.)

But the concern about creating a new layer is valid (even if Cassandra also followed this approach, as @mck wrote). For me it's hard to compare the complexity of maintaining our own lightweight abstraction layer with that of maintaining HTrace (even if the first one seems to be easier).

I think the real alternative here is just to use OpenTracing (despite the concerns about governance raised by [~michaelsembwever]) and follow the approach prototyped by [~jojochuang], [~fabbri], [~rizaon]. Or (as a first step) it could be added to the existing HTrace code, side-by-side, to evaluate it.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751408#comment-16751408 ]

Colin P. McCabe commented on HADOOP-15566:
---

HTrace *is* "a lightweight Hadoop API for the tracing where multiple implementation can be plugged in." :) The "H" originally stood for "Hadoop."

So you could just move the HTrace API classes into hadoop-common, and then have people continue using Zipkin or something as the backend. And / or write an opentracing backend to interface with those systems.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714435#comment-16714435 ]

Elek, Marton commented on HADOOP-15566:
---

Thanks [~cmccabe], I agree with your points about the importance of compatibility and of keeping the htrace support.

My proposal is:

1.) Create a lightweight Hadoop API for tracing where multiple implementations can be plugged in.
2.) Provide a default implementation which uses the existing htrace code.

Implementation details:

a) Add a new optional bytes field to the RpcHeader. Different tracing libraries could require a different size of serialized context:

{code:java}
diff --git a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
index aa146162896..e42f64eb631 100644
--- a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
+++ b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
@@ -61,9 +61,9 @@ enum RpcKindProto {
  * what span caused the new span we will create when this message is received.
  */
 message RPCTraceInfoProto {
   optional int64 traceId = 1; // parentIdHigh
   optional int64 parentId = 2; // parentIdLow
+  optional bytes tracingContext = 3; //generic tracingInformation
 }
{code}

This is a backward-compatible change.

b) In the rpc Server.java a (htrace) TraceScope is initialized based on the rpc header and propagated as part of the RpcCall:

{code:java}
RpcCall call = new RpcCall(this, header.getCallId(),
    header.getRetryCount(), rpcRequest,
    ProtoUtil.convert(header.getRpcKind()),
    header.getClientId().toByteArray(), traceScope, callerContext);
{code}

I propose to replace this traceScope with a Hadoop-specific TraceScope marker interface. The default implementation could be a simple class which wraps the htrace implementation.

c. We can create a simple Tracing singleton (similar to the DefaultMetricsSystem). Example call:

{code:java}
try (TracingSpan context = HadoopTracing.INSTANCE.newContext(call.tracingSpan, "RpcServerCall")) {
  if (remoteUser != null) {
    remoteUser.doAs(call);
  } else {
    call.run();
  }
}
{code}

d. HadoopTracing could be something like this:

{code:java}
package org.apache.hadoop.tracing;

public enum HadoopTracing {
  INSTANCE;

  private TracingProvider provider;

  public TracingSpan importContext(byte[] data) {
    return provider.importContext(data);
  }

  public byte[] exportContext() {
    return provider.exportContext();
  }

  public TracingSpan newContext(String name) {
    return provider.newContext(name);
  }

  public TracingSpan newContext(TracingSpan parentSpan, String name) {
    // not yet implemented in this sketch
    return null;
  }
}
{code}

e. We can add multiple TracingProviders (and provide one for HTrace for compatibility reasons).

+1. Personally I prefer to use some utility which adds trace support to specific methods which are annotated. It could simplify the usage of the tracing but requires a Java proxy. But this is an independent question.
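To make points (a) and (e) of the proposal above more concrete, here is a minimal, self-contained sketch of a provider whose context travels as opaque bytes (the role the proposed `tracingContext` field would play). Everything in it is an assumption for illustration: the `TracingProvider` and `TracingSpan` shapes are guessed from the method names in the comment (with `exportContext` taking the span as an argument), and `SimpleSpan` / `SimpleTracingProvider` are toy classes, not HTrace or OpenTracing code.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical span handle; AutoCloseable so it can be used in try-with-resources.
interface TracingSpan extends AutoCloseable {
    @Override
    void close(); // narrowed so callers need no checked-exception handling
}

// Hypothetical provider contract, loosely following the methods named in the proposal.
interface TracingProvider {
    TracingSpan importContext(byte[] data);
    byte[] exportContext(TracingSpan span);
    TracingSpan newContext(TracingSpan parentSpan, String name);
}

// Toy span: two longs are all the context this provider puts on the wire.
final class SimpleSpan implements TracingSpan {
    final long traceId;
    final long spanId;
    final long parentId;
    final String name;

    SimpleSpan(long traceId, long spanId, long parentId, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
        this.name = name;
    }

    @Override
    public void close() {
        // A real provider would report the finished span to its backend here.
    }
}

final class SimpleTracingProvider implements TracingProvider {
    private final AtomicLong nextId = new AtomicLong(1);

    @Override
    public TracingSpan importContext(byte[] data) {
        // Decode the opaque bytes received in the RPC header.
        ByteBuffer buf = ByteBuffer.wrap(data);
        long traceId = buf.getLong();
        long spanId = buf.getLong();
        return new SimpleSpan(traceId, spanId, 0L, "remote");
    }

    @Override
    public byte[] exportContext(TracingSpan span) {
        // Encode only what the receiving side needs: trace id and span id.
        SimpleSpan s = (SimpleSpan) span;
        return ByteBuffer.allocate(2 * Long.BYTES)
                         .putLong(s.traceId)
                         .putLong(s.spanId)
                         .array();
    }

    @Override
    public TracingSpan newContext(TracingSpan parentSpan, String name) {
        long id = nextId.getAndIncrement();
        if (parentSpan instanceof SimpleSpan) {
            SimpleSpan p = (SimpleSpan) parentSpan;
            return new SimpleSpan(p.traceId, id, p.spanId, name);
        }
        // No parent: this span starts a new trace.
        return new SimpleSpan(id, id, 0L, name);
    }
}
```

The RPC layer only ever sees a `byte[]`, so a provider that needs a larger or differently structured context can change its encoding without another change to the header definition.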
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705267#comment-16705267 ]

Sean Busbey commented on HADOOP-15566:
---

bq. Sean Busbey did a lot of work on shading the Hadoop CP --targeting HBase, but it's not been rounded off with all the hadoop-tools modules yet, including the cloud storage connectors. Someone needs to volunteer to embrace shading

I don't want to get this jira sidetracked, but could you point me at more details on the gap here? I was under the assumption that hadoop-tools stuff was project internal and thus didn't need shading.

In the downstream-facing shading we expressly don't shade HTrace because doing so breaks some of its functionality (tracing from application through libraries within the same JVM).
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704820#comment-16704820 ]

Steve Loughran commented on HADOOP-15566:
---

bq. It is definitely sad that it didn't make it out of the incubator. There is clearly a need for this kind of work in Hadoop and in other projects

Yes, it is sad; yes, there is a need.

Sean Busbey did a lot of work on shading the Hadoop CP --targeting HBase, but it's not been rounded off with all the hadoop-tools modules yet, including the cloud storage connectors. Someone needs to volunteer to embrace shading.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704118#comment-16704118 ]

Colin P. McCabe commented on HADOOP-15566:
---

Hi folks,

I just saw this JIRA while searching for something else. I was one of the guys who worked on HTrace, both on the Hadoop integration side and on the HTrace project itself. It is definitely sad that it didn't make it out of the incubator. There is clearly a need for this kind of work in Hadoop and in other projects.

I don't have a strong opinion about which other tracing API should be used in Hadoop. I would caution everyone that Hadoop's compatibility shackles are heavy -- very heavy indeed. Just to give an example, a typical Hadoop installation might have HDFS, HBase, and Phoenix installed. These projects all have separate developers, PMCs, and release cycles, but expect to be able to share the same CLASSPATH happily. Projects often push back very hard on trying to update library dependencies, especially in "minor" releases. To add to that, people often stay on older stable versions of Hadoop for years.

In theory, Hadoop vendors offer a snapshot of the full Hadoop stack, carefully configured so that things work together. In practice, libraries are not always harmonized as well as we would like. Some users want to mix and match versions of things, or not even use a vendor distribution at all. This makes setting up end-to-end tracing pretty difficult.

There were some efforts to add better CLASSPATH isolation to Hadoop. I haven't kept up with those, so I don't know how much this situation has improved.

I do think that the idea of keeping HTrace around as a shim API might make sense for Hadoop. This would mean that adding support for a new version of the OpenTracing or Zipkin library would only require updating that shim code in hadoop-common, rather than trying to coordinate changes across a dozen Hadoop projects. Also, HTrace already has code to export spans to Zipkin, if that helps. I think it would be relatively straightforward to write the same thing for opentracing as well.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623548#comment-16623548 ]

Carlos Alberto Cortez commented on HADOOP-15566:
---

Hi all,

I went ahead and did a Proof-Of-Concept migration from HTrace to OpenTracing (using Zipkin as the backend). You can inspect (and play with) the code here: [https://github.com/apache/hadoop/compare/trunk...carlosalberto:ot_initial_integration]

Some notes:

1. By default it creates a Tracer instance based on Zipkin running on localhost (for simplicity).
2. It uses the notion of a GlobalTracer to create, register, and use the Tracer from a single place.
3. As Wei-Chiu mentioned, it needed some small extra work to pass around the parent-child relationship (which is done through `SpanId` in HTrace, and `SpanContext` in OpenTracing).
4. It adds a new SpanContext field to the classes that use protobuf to pass trace info.

As mentioned, this is a POC, but I hope it can shed some light on this (and I'm happy to answer questions or contribute this as an actual migration ;) )
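The "create and register once, use from a single place" pattern in note 2 above might look roughly like the following. This is a stand-alone sketch with hypothetical names (`SimpleTracer`, `TracerRegistry`); it is neither the OpenTracing `GlobalTracer` class nor code taken from the linked branch, and `startSpan` returning a string label is a simplification for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical minimal tracer contract, for illustration only.
interface SimpleTracer {
    /** Starts a span and returns a label identifying which tracer handled it. */
    String startSpan(String operationName);
}

final class TracerRegistry {
    // Fallback used until a real tracer is registered; it records nothing.
    private static final SimpleTracer NOOP = operationName -> "noop:" + operationName;

    private static final AtomicReference<SimpleTracer> CURRENT =
        new AtomicReference<>(NOOP);

    private TracerRegistry() {}

    /**
     * Registers the process-wide tracer. Only the first registration wins,
     * so instrumented code cannot end up split across two backends.
     */
    static boolean register(SimpleTracer tracer) {
        if (tracer == null) {
            throw new IllegalArgumentException("tracer must not be null");
        }
        return CURRENT.compareAndSet(NOOP, tracer);
    }

    /** Returns the registered tracer, or the no-op fallback if none is registered. */
    static SimpleTracer get() {
        return CURRENT.get();
    }
}
```

Instrumented code would call `TracerRegistry.get()` wherever it needs a tracer, while the daemon's startup code calls `register(...)` exactly once with whichever backend is configured.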
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581746#comment-16581746 ]

Andrew Purtell commented on HADOOP-15566:
---

What about a HTrace facade for Brave (Zipkin)?
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581273#comment-16581273 ]

stack commented on HADOOP-15566:
---

[~elek] Thanks. Or we could just strip htrace. This would remove any friction caused by its injection. This would address the issue title and bulk of the description.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580831#comment-16580831 ]

Elek, Marton commented on HADOOP-15566:
---

It seems that there is no consensus yet. On the other hand, (AFAIK) htrace is used in only a few places in the hadoop source tree.

Can we create a very lightweight hadoop-specific tracing builder and use it in the hadoop code? And add a generic field to the rpc? Is it possible to support multiple tracing implementations? (The existing HTrace could be the default implementation and we could provide OT/OC implementations.)
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563253#comment-16563253 ]

Elek, Marton commented on HADOOP-15566:
---

Just for the reference, the links for the started mailing list discussions:

https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201807.mbox/%3CCADiq6%3DxdAuPT5q8PNdXBnSODzniKw2zBGo-z9PwCA2_mrDc7wg%40mail.gmail.com%3E

https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201807.mbox/%3cCADcMMgEkJ=OqhJ83-aPFQZ+TZ+5BH=7w6-tsahd9hlpuc3e...@mail.gmail.com%3e

(Thanks to [~stack] and [~jojochuang])
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563157#comment-16563157 ]

Ted Young commented on HADOOP-15566:
---

Hi there, I work on the OpenTracing project w/ Ben, thought I would weigh in!

I feel like there is somewhat of an apples-to-oranges comparison going on here. To clarify what we are trying to do with OpenTracing:

* The instrumentation API should be an abstract interface, and should not expose implementation details. That's the whole point; it's not about additional features.
* The fact that some clients ship with nifty features, such as z-pages, is actually an argument FOR an abstract interface, not against it. You can easily put a client with z-pages (or whatever new feature comes next) behind an abstract interface. Arguing that abstraction should be abandoned because a particular implementation has a useful feature doesn't make any sense. This is no different from LightStep or any other vendor arguing that you should bake in their tracing client because it has a special feature. It's a form of implementation lock-in, which is easily avoided. The whole reason we've been working on an abstracted interface for the past several years is to decouple these choices. So it's not either/or. Use a good client behind an abstraction, that's all.
* Likewise with a wire protocol. I also support the w3c protocol under development. But it is most definitely still under development. The v00 prototype version is still being mutated, and we haven't even had a meeting yet to compare notes about initial implementations. What would be the point in adding any instrumentation code which baked in something in this state? It's better to use this - or any other wire protocol that users of Hadoop may want to use - behind an interface which allows them to swap it out without rewriting code. This includes swapping in future versions of the w3c headers.

Again, just to reiterate: arguments about how particular clients may expose data usefully - or otherwise have special additional features - and arguments about the benefits of one wire protocol vs another, are actually arguments FOR an abstract instrumentation API. You really want these choices decoupled. Better implementation details may exist tomorrow, and the versioning/packaging of a tracing subsystem should be orthogonal to the versioning of Hadoop itself.

Hope that adds some clarity! FWIW, I wrote a longer-form version of my thinking here a couple of months ago, if you want more detail: [https://opensource.com/article/18/5/distributed-tracing]
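The point above about keeping the wire protocol behind an interface can be illustrated with a small sketch. All names here (`TraceContext`, `ContextCodec`, `SingleHeaderCodec`, `TwoHeaderCodec`, `RpcClientStub`) and both header layouts are made up for the example; neither layout is the w3c format or any vendor's actual format.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trace context carried between processes.
final class TraceContext {
    final String traceId;
    final String spanId;

    TraceContext(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }
}

// Hypothetical abstraction over the wire format.
interface ContextCodec {
    void inject(TraceContext ctx, Map<String, String> headers);
    TraceContext extract(Map<String, String> headers);
}

// Made-up format 1: one header holding "traceId:spanId".
final class SingleHeaderCodec implements ContextCodec {
    static final String HEADER = "x-trace";

    @Override
    public void inject(TraceContext ctx, Map<String, String> headers) {
        headers.put(HEADER, ctx.traceId + ":" + ctx.spanId);
    }

    @Override
    public TraceContext extract(Map<String, String> headers) {
        String value = headers.get(HEADER);
        if (value == null) {
            return null; // no trace context on this request
        }
        int sep = value.indexOf(':');
        if (sep < 0) {
            return null; // malformed header, treat as untraced
        }
        return new TraceContext(value.substring(0, sep), value.substring(sep + 1));
    }
}

// Made-up format 2: two separate headers.
final class TwoHeaderCodec implements ContextCodec {
    @Override
    public void inject(TraceContext ctx, Map<String, String> headers) {
        headers.put("x-trace-id", ctx.traceId);
        headers.put("x-span-id", ctx.spanId);
    }

    @Override
    public TraceContext extract(Map<String, String> headers) {
        String traceId = headers.get("x-trace-id");
        String spanId = headers.get("x-span-id");
        if (traceId == null || spanId == null) {
            return null;
        }
        return new TraceContext(traceId, spanId);
    }
}

// Instrumented code depends only on the interface, never on a concrete format.
final class RpcClientStub {
    private final ContextCodec codec;

    RpcClientStub(ContextCodec codec) {
        this.codec = codec;
    }

    Map<String, String> buildHeaders(TraceContext ctx) {
        Map<String, String> headers = new HashMap<>();
        codec.inject(ctx, headers);
        return headers;
    }
}
```

Switching formats then means constructing `RpcClientStub` with a different codec; the instrumentation inside it does not change.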
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562619#comment-16562619 ]

stack commented on HADOOP-15566:
---

bq. I just think we're going to be hard-pressed to make an informed decision without pairings of trace visualizations (ideally in many tracing systems to illustrate portability) and the respective instrumentation code to illustrate non-bloat / maintainability

bq. stack you were suggesting we try this on dev -- any pointers to a non-HDFS / non-HBase expert for a place to focus on for such an exercise?

Yeah. I just started a DISCUSS thread that points here up on dev-common. Hopefully, we'll attract doers/volunteers. What are you thinking, [~bensigelman]? You (or your company) running a comparison of libs -- OT/OC/Hacked HTrace -- for a neutral party/volunteer to evaluate?

bq. I wonder if it's would be worth evaluating writing a htrace-api->opentracing-java or htrace-api->census or htrace-api->zipkin...

I just did a refresher and unfortunately it'd be a bit of awkward work to do, [~michaelsembwever]. HTrace core entities -- probably the font of friction (we'd have to check; we could for sure do some fixup around the case when no trace is enabled) -- are classes rather than interfaces and do work passing Spans even though no trace is enabled. The other awkward fact is that there are two htrace APIs afloat in Hadoop currently, an htrace3 in older Hadoops and an htrace4 (though in different packages).

Getting traces into zipkin though should be easy enough. htrace dumps to spanreceiver implementations and these are easy to write and plug in.

[~bogdandrutu] Thanks boss for the OC input. The local-view (z-pages) makes sense. Nice instrumentation example over in the hbase client for talking to (cloud) bigtable too (smile) -- https://github.com/GoogleCloudPlatform/cloud-bigtable-client.
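The remark above that span receivers are easy to write and plug in can be illustrated with a small sketch. It deliberately does not use the real HTrace classes: `FinishedSpan`, `SpanSink`, `SpanDispatcher`, and `InMemorySink` are hypothetical names invented for this example. The dispatcher exposes an `isEnabled()` check so that callers can skip span work entirely when no sink is registered, which is one possible way to approach the "no trace enabled" friction mentioned in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class SpanSinkSketch {
    /** Hypothetical immutable record of a completed span (not an HTrace type). */
    static final class FinishedSpan {
        final String traceId;
        final String name;
        final long durationMicros;

        FinishedSpan(String traceId, String name, long durationMicros) {
            this.traceId = traceId;
            this.name = name;
            this.durationMicros = durationMicros;
        }
    }

    /** Hypothetical plugin point: one implementation per backend. */
    interface SpanSink {
        void receive(FinishedSpan span);
    }

    /** Fans finished spans out to whatever sinks have been registered. */
    static final class SpanDispatcher {
        private final List<SpanSink> sinks = new CopyOnWriteArrayList<>();

        void register(SpanSink sink) {
            sinks.add(sink);
        }

        /** True only when a sink is registered; callers can skip span creation otherwise. */
        boolean isEnabled() {
            return !sinks.isEmpty();
        }

        void dispatch(FinishedSpan span) {
            for (SpanSink sink : sinks) {
                sink.receive(span);
            }
        }
    }

    /** Example sink that keeps spans in memory, standing in for a real exporter. */
    static final class InMemorySink implements SpanSink {
        final List<FinishedSpan> received = new ArrayList<>();

        @Override
        public void receive(FinishedSpan span) {
            received.add(span);
        }
    }

    public static void main(String[] args) {
        SpanDispatcher dispatcher = new SpanDispatcher();
        InMemorySink sink = new InMemorySink();
        dispatcher.register(sink);
        if (dispatcher.isEnabled()) {
            dispatcher.dispatch(new FinishedSpan("trace-1", "readBlock", 150L));
        }
        System.out.println("spans received: " + sink.received.size());
    }
}
```

A backend-specific sink would replace `InMemorySink`; the instrumented code only ever talks to the dispatcher.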
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562025#comment-16562025 ] BOGDAN DRUTU commented on HADOOP-15566: ---

Hello all,

First, sorry for jumping into this issue, but I will try to be short (edited after I finished the comment: I was wrong) and as project-independent as possible (for the record, I am one of the main contributors to OpenCensus; also, in my previous life I debugged a lot of BigTable issues using the same technology as OpenCensus).

Some comments about other comments in this issue:

[~bensigelman] - FYI: OpenCensus does not enforce any wire format. The format is configurable and we are adding support for the w3c standard.

[~elek] - About OT vs OC: in my personal opinion the difference is the philosophy behind these projects. OT was designed with the mindset of being an open-source API for vendors to implement, and because of this certain tradeoffs were made to help some vendors (as [~michaelsembwever] mentioned). OC was designed to be a fully implemented library that supports multiple different backends (Zipkin, Jaeger, Stackdriver, AppInsight, etc.) as well as in-process debugging capabilities. For example, one of the key features that I used a lot when I debugged BigTable issues is what OpenCensus calls z-pages (in-process handlers to track active requests, in-memory latency-based sampled spans, stats, etc.). You can take a look here [https://opencensus.io/core-concepts/z-pages/#1].

Based on my small experience there are 3 components that are critical in the instrumentation of a service:

# Wire propagation (I saw a previous discussion about this). [https://github.com/w3c/distributed-tracing] - it is a w3c standard proposed by a couple of APM vendors and cloud providers.
Even though the format is mostly focused on HTTP requests, HBase can define its own format if needed, the only requirement being the ability to propagate all fields defined in the format (trace-id, span-id, trace-options and tracestate). This part is critical when HBase is used as a service (e.g. something like Google Bigtable, which works with the HBase client): having standard fields that are propagated allows service owners to correlate incoming requests from a customer with the internal trace. A similar issue may occur when only HDFS is used as a service.
# APIs to start/end a span, record tracing events, etc. There are multiple open source APIs (including OpenCensus, OpenTracing, Zipkin, etc.).
# In-process propagation. This can be implemented in two ways: explicitly propagate the current "Span" between function calls, runnables, callables, etc., or implicitly, usually using a thread-local mechanism. Regarding a previous comment from [~stack] about keeping this working, my personal experience is that you can achieve this using the "implicit" mechanism described above by having a clean context api (for an example of a context api that works well I can recommend [https://grpc.io/grpc-java/javadoc/io/grpc/Context.html]) and ensuring that all async calls are wrapped accordingly (e.g. wrapping all Executors). The "explicit" mechanism may be very hard to maintain and, based on my experience, annoying for developers. This part is very important when instrumenting the HBase client (which I think should be instrumented in order to debug more complex issues) because the client is used as a library, and a standard way to propagate the current Span is very important in order to continue the same trace between the client application and the bigtable client.

When OpenCensus was designed I thought that it was very important that the library ensures all these 3 components are covered.
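The "implicit" in-process propagation described in item 3 can be sketched in a few lines. This is only an illustration under stated assumptions: `CurrentSpan`, `wrapTask`, and `wrapExecutor` are hypothetical names made up here, a plain `String` stands in for a real span object, and a hand-rolled `ThreadLocal` stands in for a full context api such as the gRPC `Context` class linked above.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ImplicitPropagationSketch {
    /** Holds the current span id per thread; a String stands in for a real span. */
    static final class CurrentSpan {
        private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

        private CurrentSpan() { }

        static String get() {
            return CURRENT.get();
        }

        static void set(String spanId) {
            CURRENT.set(spanId);
        }
    }

    /** Captures the caller's span now and reinstates it around the task when it runs. */
    static Runnable wrapTask(Runnable task) {
        final String captured = CurrentSpan.get();
        return () -> {
            String previous = CurrentSpan.get();
            CurrentSpan.set(captured);
            try {
                task.run();
            } finally {
                // Restore whatever the worker thread had before, so spans do not leak.
                CurrentSpan.set(previous);
            }
        };
    }

    /** Wraps an executor so every submitted task carries the submitter's span. */
    static Executor wrapExecutor(Executor delegate) {
        return task -> delegate.execute(wrapTask(task));
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Executor traced = wrapExecutor(pool);
        CurrentSpan.set("span-42");
        traced.execute(() -> System.out.println("worker sees " + CurrentSpan.get()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The design point is that application code never passes a span argument around; only the places that hop threads (executors, callbacks) need wrapping.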
Some may say that 1) is not important when deployed internally, but with the new cloud providers this becomes more common; others may say that 3) is not important, but when instrumenting client libraries (like the HBase client) this becomes very important in my opinion. FYI there are other libraries that solve these issues as well, like Zipkin, etc., but I am not here to suggest one particular library, just to explain the concepts, the issues, and what is important to think about. In my personal opinion OpenTracing does not deal very well with 1 and 3 (probably on purpose), but I am not an expert in OpenTracing or one of its owners/authors/co-authors, so I cannot comment on what is good or bad in their design choices. These are my thoughts about what you should consider when you pick one library over another.

Related to OpenCensus, we are happy to help if you have any questions about our design choices, or about stats/metrics support in OpenCensus and why we think that these are very important as well.

PS: Hope the comment makes sense; it became larger than expected, but I tried to give an overview of the whole instrumentation issue.
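As a rough illustration of component 1 (wire propagation), the sketch below carries the four fields named in the comment (trace-id, span-id, trace-options, tracestate) through a string map that stands in for RPC headers. Everything here is an assumption made for the example: the class and method names are hypothetical, and the key names and hex encoding are not taken from the w3c draft or from any Hadoop/HBase protocol.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class WirePropagationSketch {
    // Carrier key names are assumptions for this sketch, not a standardized layout.
    static final String TRACE_ID = "trace-id";
    static final String SPAN_ID = "span-id";
    static final String TRACE_OPTIONS = "trace-options";
    static final String TRACESTATE = "tracestate";

    /** Hypothetical value object holding the propagated fields. */
    static final class TraceContext {
        final String traceId;
        final String spanId;
        final int traceOptions;   // bit flags, kept in the range 0..255
        final String traceState;  // opaque data, forwarded unchanged; may be empty

        TraceContext(String traceId, String spanId, int traceOptions, String traceState) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.traceOptions = traceOptions & 0xff;
            this.traceState = traceState;
        }
    }

    /** Client side: copy the context into the outgoing request's headers. */
    static void inject(TraceContext ctx, Map<String, String> carrier) {
        carrier.put(TRACE_ID, ctx.traceId);
        carrier.put(SPAN_ID, ctx.spanId);
        carrier.put(TRACE_OPTIONS, String.format(Locale.ROOT, "%02x", ctx.traceOptions));
        if (!ctx.traceState.isEmpty()) {
            carrier.put(TRACESTATE, ctx.traceState);
        }
    }

    /** Server side: rebuild the context, or return null if the caller sent none. */
    static TraceContext extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_ID);
        String spanId = carrier.get(SPAN_ID);
        if (traceId == null || spanId == null) {
            return null;
        }
        int options = 0;
        String rawOptions = carrier.get(TRACE_OPTIONS);
        if (rawOptions != null) {
            try {
                options = Integer.parseInt(rawOptions, 16);
            } catch (NumberFormatException e) {
                options = 0; // tolerate malformed flags rather than failing the request
            }
        }
        return new TraceContext(traceId, spanId, options, carrier.getOrDefault(TRACESTATE, ""));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(new TraceContext("abc123", "def456", 1, "vendor=x"), headers);
        TraceContext received = extract(headers);
        System.out.println(received.traceId + " / " + received.spanId);
    }
}
```

Because the server only needs the map, the same fields could ride in an HTTP header, an RPC header field, or any other carrier the protocol offers.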
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561363#comment-16561363 ] Adrian Cole commented on HADOOP-15566: --

TL;DR: I would advise evaluating all the options, perhaps by resurrecting a small part of htrace in order to give a more seamless migration and support path. This allows *sites* to participate in the decision making *before committing to an approach.* Depending on the choices permitted, this might imply api or model changes to make it work. Doing this decoupled from hadoop moves the thrash to where it belongs.

Ironically, while I was at twitter the data services team preferred htrace to zipkin, even though zipkin was there. It would be nice to also have a focus on brown field, like a solution that works with both today and tomorrow. *Many won't upgrade hadoop for many years* to 3.1. Sites should be preferred and deferring input from them, we should try to act on their behalf... saying again: thrash behind the api before considering thrashing an api.

Resurrecting the "api" part would also allow a less conjectured guide to moving forward, one that has to firstly tackle concerns technically, such as parents. It is easy to say how something might work and another thing entirely to have it work, and have it work efficiently, and have it work in ways that are safe. Doing this buys more time to make informed decisions, and gives people who have never worked on data systems a chance to get that experience first.

Even in services tracing, we've noticed a lot of things left to end users to sort out; it seems data services should have even more rigor. For example, HTrace code includes a lot of guards that prevent excess network communication. These things are inconsistent across OT as threading concerns are an implementation detail; there is neither a spec nor a TCK on reporting, except some guidance to be good.
One could conjecture that Census would be good for hadoop if it is good for google internally with bigtable. However, even that shouldn't be left to conjecture. Many ecosystems have a fair amount of full-time staff, and could possibly use those staff towards vetting the concerns already implemented by the htrace libraries.

Anyway, I hope this response is not ruffling feathers. I've tried hard to not have it do so. While I am less qualified than some to participate in this discussion, you can look at the source history and otherwise. I have personally fixed code here and elsewhere to make interop work. I also collaborated with a site owner to open up the transports. I primarily take care of the openzipkin volunteer community even if I am paid a salary. I don't make any more or less money if hadoop chooses one thing vs another.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561322#comment-16561322 ] mck commented on HADOOP-15566: --

bq. Thats this stuff: https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/tracing ?

That's correct. It's a bit different, I'm presuming, for the hadoop ecosphere as its tracing api is htrace. So I'm speaking off-the-cuff, but I wonder if it would be worth evaluating writing a htrace-api->opentracing-java or htrace-api->census or htrace-api->zipkin layer (as many backends now accept zipkin traces, in fact more than opentracing last time I checked, so zipkin might well be considered the de facto standard atm). Any of these could form a template to help others to write htrace->xyz plugins. While htrace may be disappearing, maintaining just its api in this form for plugins may not be a big deal, and it provides end-to-end tracing in many *existing* hadoop ecosystems.

bq. Could try re-emitting existing (h)traces to zipkin – it used to work – or whatever sink.

Yup, that's what I was trying to explain above. But as a plugin. Folk will appreciate that they only need to instrument one api rather than a whole ecosystem again. And I wouldn't be comfortable betting on one abstraction layer over another, not right now.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561193#comment-16561193 ] Ben Sigelman commented on HADOOP-15566: ---

Re the actual technical issue (there's a PS below about the more FUD-oriented points): rather than expecting maintainers of every ASF storage system *and* the maintainers of every distributed tracing system to (a) decide on the nuances of a data model, then (b) write bindings from "Storage System X's" tracing hooks to "Tracing System Y's" client library (for all combinations of X's and Y's), we can instrument the ASF storage systems with a single API that has been specifically designed to be portable.

To address [~stack]'s question about performance, the noop implementation of OpenTracing tracers amounts to an empty function call but avoids the costs and/or lock contention of generating random numbers, context objects, and so forth.

Another point that [~stack] made: {quote}For me, the hard part is not which tracing lib to use – if a tracing lib discussion, lets do it out on dev? {quote} I 90% agree with this. Certainly as a response to [~michaelsembwever], in any case, I would be glad to see a side-by-side using OpenTracing vs "something custom" to understand the amount of *additional* work required to actually get end-to-end tracing to work. That said, doing the tracing lib analysis "on dev" should also take the application developer experience into account... whatever we decide to do must require a minimum of configuration work (or educational work) for application developers, and that means that we should think hard about being agnostic about the tracing system "above" the storage systems under consideration here – ideally we are able to plug into any of them without forcing the application developer / operator to write new code or go on a yak-shaving mission.
As a concrete next step, I would be curious to see the code / branch that [~jojochuang] used to generate the OT+Jaeger screenshots above. I would also like to create a dev branch of HDFS or Cassandra that adds "native" OpenTracing instrumentation to a distributed code path that the HDFS devs think would be instructive/representative... I just think we're going to be hard-pressed to make an informed decision without pairings of trace visualizations (ideally in many tracing systems to illustrate portability) *and* the respective instrumentation code to illustrate non-bloat / maintainability. Would that be useful? [~stack] you were suggesting we try this on dev – any pointers to a non-HDFS / non-HBase expert for a place to focus on for such an exercise? {color:#707070}PS: {color}[~michaelsembwever]{color:#707070}, that was a lot of FUD to pack into one message ("bloat its API with vendor concerns", "hostile to the ASF", "hostile ... to those tracing solutions those vendors see as competition", etc). These concerns were also presented without any evidence – unsurprisingly, as I doubt that evidence exists. OpenTracing's two most common "pairings" are Zipkin and Jaeger, neither of which are commercial solutions. To the contrary of what you suggest, the API is intentionally – if not primarily – designed to focus on _describing system behavior_ rather than the concerns of any downstream tracing system (OSS or commercial). All OpenTracing meetings are recorded and the notes are public if people here would like to judge for themselves about the openness and intent of the actual decision process (as opposed to the one you described/imagined). 
For those who want a primer on what we're up to, I would recommend reading either [this doc that I wrote when we were just getting started|https://medium.com/opentracing/towards-turnkey-distributed-tracing-5f4297d1736], or [this more recent doc explaining how OT fits into the larger ecosystem|https://medium.com/opentracing/the-difference-between-tracing-tracing-and-tracing-84b49b2d54ea] that's developed in the interim.{color}
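The performance argument in the comment above (a no-op tracer costs roughly an empty call) can be made concrete with a sketch. The interfaces below are not the OpenTracing API; `MiniTracer`, `MiniSpan`, `NoopTracer`, `RecordingTracer`, and `readBlock` are hypothetical names used only to show the shape of the idea: instrumented code is written once against an interface, and the no-op implementation hands back one shared span without generating ids or touching shared state.

```java
import java.util.ArrayList;
import java.util.List;

final class NoopTracerSketch {
    /** Hypothetical minimal span; close() is narrowed to throw no checked exception. */
    interface MiniSpan extends AutoCloseable {
        MiniSpan tag(String key, String value);

        @Override
        void close();
    }

    /** Hypothetical minimal tracer that instrumented code depends on. */
    interface MiniTracer {
        MiniSpan start(String operation);
    }

    /** Tracing disabled: every call returns the same do-nothing span. */
    static final class NoopTracer implements MiniTracer {
        private static final MiniSpan NOOP_SPAN = new MiniSpan() {
            @Override
            public MiniSpan tag(String key, String value) {
                return this;
            }

            @Override
            public void close() { }
        };

        @Override
        public MiniSpan start(String operation) {
            return NOOP_SPAN;
        }
    }

    /** Tracing enabled: records the names of finished operations in memory. */
    static final class RecordingTracer implements MiniTracer {
        final List<String> finished = new ArrayList<>();

        @Override
        public MiniSpan start(final String operation) {
            return new MiniSpan() {
                @Override
                public MiniSpan tag(String key, String value) {
                    return this;
                }

                @Override
                public void close() {
                    finished.add(operation);
                }
            };
        }
    }

    /** Instrumented code path; it is identical whichever tracer is plugged in. */
    static int readBlock(MiniTracer tracer, int blockId) {
        try (MiniSpan span = tracer.start("readBlock")) {
            span.tag("block", String.valueOf(blockId));
            return blockId * 2; // stand-in for the real work
        }
    }

    public static void main(String[] args) {
        RecordingTracer recording = new RecordingTracer();
        readBlock(new NoopTracer(), 7);
        readBlock(recording, 7);
        System.out.println("recorded operations: " + recording.finished);
    }
}
```

Swapping `NoopTracer` for a backend-specific tracer is a configuration decision, which is the decoupling the comment argues for.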
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561007#comment-16561007 ] stack commented on HADOOP-15566:

Thanks for the input [~michaelsembwever].

bq. as the effort is more in adding the instrumentation code in the first place, and not so much writing the abstraction layer.

Agree.

bq. With Cassandra ...of maintaining the existing tracing code as the abstraction layer, and allowing plugins to it.

That's this stuff: https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/tracing ?

Could try re-emitting existing (h)traces to zipkin -- it used to work -- or whatever sink. Would also need to fix it so trace inserts are friction-free when disabled (currently they drag).
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560933#comment-16560933 ] mck commented on HADOOP-15566: --

I've become pretty ho-hum about OpenTracing, and I write that as one of the original authors of OpenTracing-Java. It's not the de facto abstraction layer many presume it to be.

Having participated in the tracing community for the past 5 years, being there as Zipkin became one community from many github forks into OpenZipkin, and now mentoring SkyWalking through the incubator process and into the ASF, I was at first a big fan of OT and promoted it at conferences. In the beginning it did hold a lot of potential to become that de facto standard. As time went by we've seen it become controlled by commercial interests, bloat its API with vendor concerns, and be at times hostile to the ASF and to those tracing solutions those vendors see as competition. Part of this is how the commercial world works, I accept, but I have used it in conference presentations as a counter-example to why the Apache Way is so important when what we want is project stability.

With Cassandra we took the approach of maintaining the existing tracing code as the abstraction layer, and allowing plugins to it. This proved the easiest approach as the effort is more in adding the instrumentation code in the first place, and not so much writing the abstraction layer. A Cassandra to Zipkin plugin was added, along with a Cassandra to OpenTracing plugin, but the latter was dropped as it became obvious that writing a Cassandra plugin to whatever tracing solution you wanted was not really so much work.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554695#comment-16554695 ] Steve Loughran commented on HADOOP-15566: -

Stack is of course correct: we want this stuff used end-to-end. We do this today with logging across our JARs; we need something beyond logging to track down performance/blame across everything. We should avoid dictating "you must use reporting tool X" for your analysis: that limits which people will want to use the tracing, and so how broadly it gets used. I don't want to have to worry about what they do with that data.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553366#comment-16553366 ] Elek, Marton commented on HADOOP-15566: ---

[~bensigelman] Thank you very much for your answer. It was very informative and the arguments are reasonable. Especially the last paragraph:

{quote} Also, building a general-purpose adapter to convert OpenTracing instrumentation into OpenCensus API calls would be straightforward (due to the relative "thickness" and numbers of implementation assumptions made in each project). Going the other way would be challenging or impossible, depending on reliance on OpenCensus wire formats. {quote}
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553144#comment-16553144 ] stack commented on HADOOP-15566:

For me, the hard part is not which tracing lib to use -- if a tracing lib discussion, let's do it out on dev? We should also invite others to the discussion -- but rather discussion around resourcing:

* Ensuring traces tell a good narrative across the different code paths and over processes, and that trace paths remain intact across code churn; they are brittle and easily broken/disconnected as dev goes on.
* Instrumenting/coverage -- inserting trace points is time-consuming work whose value is only realized down the road by an operator/dev trying to figure out a slowdown (so https://github.com/opentracing-contrib/java-tracerresolver looks interesting).
* Tooling to enable tracing and visualize needs to be easy to deploy and use, else all will go to rot (some orgs trace every transaction, with a simple switch for dumping to a visualizer that is up and always available).
* Ensuring traces are friction-free, else they'll be removed or not taken on in the first place.
* Evangelizing and pushing trace across hadoop components; the more components instrumented, the more we all will benefit.

Thanks.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552907#comment-16552907 ] Ben Sigelman commented on HADOOP-15566: --- [~elek] the projects have similar goals but take different approaches. OpenTracing's surface area is intentionally "as narrow as possible" which means that it brings in almost no dependencies (OpenCensus is more of a fully-featured "agent" model, which necessarily gives it a larger footprint). OpenTracing also makes no assumptions about the serialization formats (or header names, etc) between peered processes in the distributed system/application, or the serialization format of the tracing system itself. This means that OpenTracing instrumentation can be used/reused for a wider variety of things: straightforward distributed trace collectors/indexers/viewers like Zipkin, Jaeger, etc, but also distributed debuggers, security applications, and so forth. Also, building a general-purpose adapter to convert OpenTracing instrumentation into OpenCensus API calls would be straightforward (due to the relative "thickness" and numbers of implementation assumptions made in each project). Going the other way would be challenging or impossible, depending on reliance on OpenCensus wire formats.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552808#comment-16552808 ] Elek, Marton commented on HADOOP-15566: ---

As far as I know the problem could be solved with both OpenTracing and OpenCensus. Is there any reason to prefer OpenTracing? What would be the advantages/disadvantages of using OC vs OT?
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535828#comment-16535828 ] Ben Sigelman commented on HADOOP-15566: --- [~ste...@apache.org] I agree that a new field (for the tracing context) makes the most sense from a compatibility standpoint. [~jojochuang]: are there specific blockers or questions you have about the OT port? If so, let me know and I'll do my best to address/answer them. I'm also happy to help with the mechanics of the change (or find someone with more cycles in the OT community to do the same). This can be a really positive thing for HDFS, HBase, etc, as the traces within the datastore/filesystem can be connected to the traces in the application above, regardless of the particular tracing system in use. (I know that at google, it was valuable for both the bigtable core team and for bigtable users to see traces that wend their way from app code into bigtable and back, esp for slow / poorly-constructed queries).
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534634#comment-16534634 ] Steve Loughran commented on HADOOP-15566:
Nice screenshots.
* The HDFS team needs to be involved in all discussions w.r.t. tracing and wire protocols; I see Anu is watching; [~jnp] should keep an eye on it too.
* And HBase, e.g. @stack.

I don't see the existing field being reusable unless there's no risk that an HTrace client talking to an OpenTracing server, or an OpenTracing client talking to an HTrace-enabled server, is going to do bad things. Even if we don't know of anything in production, it's part of our [compatibility definition|http://hadoop.apache.org/docs/r3.0.3/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_Protocols]. Making it a new optional field and ignoring the HTrace one is the safe route.
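To make the "new optional field, ignore the HTrace one" route concrete, here is a minimal sketch in plain Java. It is not Hadoop's actual RPC header code: `RpcTraceHeader`, its two fields, and `TraceHeaderCompatDemo` are hypothetical names invented for illustration, and a real change would live in the protobuf definitions rather than in a hand-written class.

```java
import java.util.Optional;

/**
 * Hypothetical, simplified view of an RPC header after the change.
 * Assumption: the legacy HTrace id stays on the wire, and a new optional
 * field carries an opaque, tracer-specific span context.
 */
final class RpcTraceHeader {
    // Written by older HTrace-enabled clients; null when absent.
    final Long legacyHtraceTraceId;
    // New optional field; null when the client does not send it.
    private final byte[] spanContext;

    RpcTraceHeader(Long legacyHtraceTraceId, byte[] spanContext) {
        this.legacyHtraceTraceId = legacyHtraceTraceId;
        this.spanContext = spanContext == null ? null : spanContext.clone();
    }

    /** A new server reads only the new field and never interprets the legacy one. */
    Optional<byte[]> tracingContext() {
        return Optional.ofNullable(spanContext).map(byte[]::clone);
    }
}

final class TraceHeaderCompatDemo {
    /** Decide how a new server treats an incoming call. */
    static String describe(RpcTraceHeader header) {
        return header.tracingContext()
                .map(ctx -> "continue trace (" + ctx.length + " bytes of context)")
                .orElse("no usable context; handle call untraced");
    }

    public static void main(String[] args) {
        // Older client with HTrace enabled: legacy field set, new field absent.
        System.out.println(describe(new RpcTraceHeader(42L, null)));
        // Newer client: new field set, legacy field absent.
        System.out.println(describe(new RpcTraceHeader(null, new byte[] {1, 2, 3})));
    }
}
```

Because the server never interprets the legacy field, an older HTrace-enabled client is served normally and simply goes untraced, which is the compatibility property asked for above.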
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530630#comment-16530630 ] Ben Sigelman commented on HADOOP-15566:
The screenshots are really nice to see! I'm not sure how you all like to work, but I am happy to help discuss how to make all of this work from an OpenTracing best-practices standpoint (and/or try to find people to help with the instrumentation or porting effort).
There was a question about Tracer impls:
{quote}I can see people might want an implementation that is more neutral, For example, Jaeger comes from Uber, and people might not want to use it (hey, any Lyft developers here? :)) {quote}
Typically the idiom is to let the user pass in a `Tracer` impl dynamically, but fall back on the `GlobalTracer` mechanism if no user-specified `Tracer` was provided. There's also a contributed (and wholly optional) OpenTracing utility to do `Tracer` injection dynamically (i.e., with zero code modification): [https://github.com/opentracing-contrib/java-tracerresolver]
Also, re wire protocols: OpenTracing is designed to be intentionally agnostic about wire protocols and abstracts serialization ("injection") and deserialization ("extraction") into the `Tracer` implementation. If there are questions about best practices around this, please @-mention me and I'll do my best to help.
(Thanks again, all)
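The "user-supplied `Tracer`, else fall back to a global one" idiom described above can be sketched as follows. To stay self-contained, this does not use the OpenTracing library: `DemoTracer`, `DemoGlobalTracer`, and `DemoFileSystemClient` are hypothetical stand-ins, and the real `Tracer` and `GlobalTracer` types have a richer API than shown here.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a tracer API; only what this sketch needs. */
interface DemoTracer {
    String name();
}

/** Hypothetical process-wide registry, in the spirit of a global tracer. */
final class DemoGlobalTracer {
    private static final DemoTracer NOOP = () -> "noop";
    private static final AtomicReference<DemoTracer> CURRENT =
            new AtomicReference<>(NOOP);

    private DemoGlobalTracer() {}

    /** Register a process-wide tracer; null resets to the no-op tracer. */
    static void register(DemoTracer tracer) {
        CURRENT.set(tracer == null ? NOOP : tracer);
    }

    static DemoTracer get() {
        return CURRENT.get();
    }
}

/** A client that prefers an explicitly supplied tracer over the global one. */
final class DemoFileSystemClient {
    private final DemoTracer explicitTracer; // may be null

    DemoFileSystemClient(DemoTracer explicitTracer) {
        this.explicitTracer = explicitTracer;
    }

    /** Resolved on each call, so a tracer registered later is still picked up. */
    DemoTracer tracer() {
        return explicitTracer != null ? explicitTracer : DemoGlobalTracer.get();
    }

    public static void main(String[] args) {
        DemoFileSystemClient fallback = new DemoFileSystemClient(null);
        System.out.println(fallback.tracer().name());
        DemoGlobalTracer.register(() -> "vendor-tracer");
        System.out.println(fallback.tracer().name());
        System.out.println(new DemoFileSystemClient(() -> "user-tracer").tracer().name());
    }
}
```

With this shape, the instrumented code never names a vendor: whoever deploys it either passes a tracer in or registers one globally, and an unconfigured process falls through to the no-op tracer.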
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530265#comment-16530265 ] Aaron Fabbri commented on HADOOP-15566:
Nice work [~jojochuang]. I had fun hacking on this for a day. Attaching a screenshot from the S3A tracing I added, uploading a file to S3.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530226#comment-16530226 ] Steve Loughran commented on HADOOP-15566:
bq. we'll need to update client -> namenode RPC messages, as well as client -> datanode RPC, KMS Rest API. So wire compatibility needs to be considered. (Some messages already carries htrace trace id. Would it make sense to replace the htrace trace id field with opentracing trace id field?
If it breaks wire compatibility (that is, unless it is a protobuf optional field), it'll be an incompatible protocol change. If the field is reused, the servers need to handle the situation of "older client with HTrace enabled makes an RPC call to a server with OpenTracing".
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530149#comment-16530149 ] Wei-Chiu Chuang commented on HADOOP-15566:
Attached a screenshot of an HDFS client data write pipeline trace. !Screen Shot 2018-06-29 at 11.59.16 AM.png!
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530144#comment-16530144 ] Wei-Chiu Chuang commented on HADOOP-15566:
Hi Ben! With help from [~tlipcon], I worked with [~fabbri] and [~rizaon] and spent a day or two on porting HTrace to OpenTracing. It turned out to be quite a fun exercise. Most of the porting is mechanical, changing HTrace spans to OpenTracing spans; it took me a while to figure out how to pass the trace id in OpenTracing, but it's doable. I was even able to add some tracing code that was lacking before.

Some observations:
# Porting the code in Hadoop seems straightforward.
# I am not aware of anyone using HTrace in production, so I don't expect too much resistance to replacing it. (Shout out if this is not the case.)
# Embracing OpenTracing, which is becoming the de facto tracing standard, makes it possible to trace end-to-end, from non-Hadoop applications into Hadoop.

Some possible hurdles:
# To pass the trace id around, we'll need to update client -> namenode RPC messages, as well as client -> datanode RPC and the KMS REST API. So wire compatibility needs to be considered. (Some messages already carry an HTrace trace id. Would it make sense to replace the HTrace trace id field with an OpenTracing trace id field? Or should the OpenTracing trace id be appended? Hopefully there's not much overhead.)
# OpenTracing is just a set of APIs. We used Jaeger as the implementation. I can see people might want an implementation that is more neutral. For example, Jaeger comes from Uber, and people might not want to use it (hey, any Lyft developers here? :))
# Community adoption: I am aware HBase uses HTrace. So if we switch to OpenTracing, it'll need some coordination to convince the HBase community to switch too (I'd be happy to contribute). And I am hoping to convince other communities to adopt OpenTracing as well. It's not too interesting if OpenTracing is adopted in Hadoop but not in Hive or Spark or Kafka.

Thoughts?
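Passing the trace id across an RPC boundary usually comes down to an inject step on the client and an extract step on the server. The sketch below illustrates that round trip over a plain string map. Everything in it is an assumption made for illustration: `SimpleSpanContext`, `TextMapPropagation`, and the `demo-trace-id` / `demo-span-id` keys are hypothetical, and a real tracer implementation defines its own context type and wire keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical minimal span context: just the ids that must cross the wire. */
final class SimpleSpanContext {
    final long traceId;
    final long spanId;

    SimpleSpanContext(long traceId, long spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }
}

/** Hypothetical text-map propagation: ids travel as hex strings in a carrier map. */
final class TextMapPropagation {
    static final String TRACE_KEY = "demo-trace-id"; // illustrative key name
    static final String SPAN_KEY = "demo-span-id";   // illustrative key name

    private TextMapPropagation() {}

    /** Client side: serialize the context into the outgoing carrier. */
    static void inject(SimpleSpanContext ctx, Map<String, String> carrier) {
        carrier.put(TRACE_KEY, Long.toHexString(ctx.traceId));
        carrier.put(SPAN_KEY, Long.toHexString(ctx.spanId));
    }

    /** Server side: rebuild the context, or return empty if missing or malformed. */
    static Optional<SimpleSpanContext> extract(Map<String, String> carrier) {
        String trace = carrier.get(TRACE_KEY);
        String span = carrier.get(SPAN_KEY);
        if (trace == null || span == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(new SimpleSpanContext(
                    Long.parseUnsignedLong(trace, 16),
                    Long.parseUnsignedLong(span, 16)));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Map<String, String> carrier = new HashMap<>();
        inject(new SimpleSpanContext(0xABCDL, 7L), carrier);
        System.out.println(carrier);
        System.out.println(extract(carrier).isPresent());
    }
}
```

Returning an empty result for a missing or malformed carrier keeps the server tolerant of clients that send no tracing context at all, which matters for the wire-compatibility concern raised in the first hurdle.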
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529150#comment-16529150 ] Ben Sigelman commented on HADOOP-15566:
Hi all. Someone sent this my way. I am one of the OpenTracing co-creators and would be delighted to collaborate on adding (minimal-dependency, lightweight) OpenTracing instrumentation to ASF projects. In serendipitous news, we have recently added some resources to help with the actual instrumentation work for well-used and well-loved projects like those in the ASF.
(PS: I am generally oversubscribed and can be bad at things like JIRA, but am 100% happy to help here... if I'm being flaky about JIRA responses, please reach out to me at [b...@gmail.com|mailto:b...@gmail.com] where I maintain a better SLA ;))
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526619#comment-16526619 ] Aaron Fabbri commented on HADOOP-15566:
Agreed. HTrace is great but suffered from everyone being too busy to give it the love it needed to develop. We're going to spend some time today seeing if we can plug in an OpenTracing implementation. Will report back with any interesting findings.
[jira] [Commented] (HADOOP-15566) Remove HTrace support
[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525547#comment-16525547 ] Steve Loughran commented on HADOOP-15566:
It's a shame to hear about HTrace. We've an outstanding JIRA to add it to S3A (HADOOP-12949), and HADOOP-15407 includes it in the ABFS connector, so I'd like to have an alternative. All we really want in the Hadoop code is the instrumentation to publish information, and to propagate context information all the way down from applications. And we want those applications to wire up to it too, obviously.