[ https://issues.apache.org/jira/browse/SPARK-21962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368877#comment-16368877 ]
Paul Doran commented on SPARK-21962:
------------------------------------

I have a working local version for Jaeger implementing the SparkListenerInterface. Is the scope of this issue broader? Either way I'd be happy to open source what I have to provide something similar to: [https://github.com/JasonMWhite/spark-datadog-relay]

Thanks

> Distributed Tracing in Spark
> ----------------------------
>
>                 Key: SPARK-21962
>                 URL: https://issues.apache.org/jira/browse/SPARK-21962
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Andrew Ash
>            Priority: Major
>
> Spark should support distributed tracing, which is the mechanism, widely
> popularized by Google in the [Dapper
> Paper|https://research.google.com/pubs/pub36356.html], where network requests
> have additional metadata used for tracing requests between services.
> This would be useful for me since I have OpenZipkin style tracing in my
> distributed application up to the Spark driver, and from the executors out to
> my other services, but the link is broken in Spark between driver and
> executor since the Span IDs aren't propagated across that link.
> An initial implementation could instrument the most important network calls
> with trace ids (like launching and finishing tasks), and incrementally add
> more tracing to other calls (torrent block distribution, external shuffle
> service, etc) as the feature matures.
> Search keywords: Dapper, Brave, OpenZipkin, HTrace
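
To make the broken link in the issue description concrete, here is a minimal sketch of what propagating span IDs from driver to executor could look like. It is not Spark code and not the Jaeger listener mentioned in the comment. The `Span` class, the `inject`/`extract` helpers, and the use of a plain dict as the task-launch metadata are all assumptions made for illustration; the key names are borrowed from OpenZipkin's B3 headers.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical property keys. The B3-style names come from OpenZipkin and are
# used here only for illustration; Spark does not define them.
TRACE_ID_KEY = "X-B3-TraceId"
SPAN_ID_KEY = "X-B3-SpanId"


@dataclass(frozen=True)
class Span:
    trace_id: str             # shared by every span in one trace
    span_id: str              # unique to this span
    parent_id: Optional[str]  # span_id of the caller, None for a root span
    name: str


def new_id() -> str:
    """Return a random 64-bit id as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def start_span(name: str, parent: Optional[Span] = None) -> Span:
    """Start a root span, or a child span that shares the parent's trace id."""
    if parent is None:
        return Span(new_id(), new_id(), None, name)
    return Span(parent.trace_id, new_id(), parent.span_id, name)


def inject(span: Span, props: Dict[str, str]) -> None:
    """Driver side: copy the span's ids into the task-launch metadata."""
    props[TRACE_ID_KEY] = span.trace_id
    props[SPAN_ID_KEY] = span.span_id


def extract(props: Dict[str, str], name: str) -> Span:
    """Executor side: continue the trace if ids are present, else start a new one."""
    trace_id = props.get(TRACE_ID_KEY)
    parent_id = props.get(SPAN_ID_KEY)
    if trace_id is None or parent_id is None:
        return start_span(name)
    return Span(trace_id, new_id(), parent_id, name)


if __name__ == "__main__":
    driver_span = start_span("driver: launch task")
    task_props: Dict[str, str] = {}  # stands in for the serialized launch message
    inject(driver_span, task_props)
    executor_span = extract(task_props, "executor: run task")
    print(driver_span)
    print(executor_span)
```

The executor-side span keeps the driver's trace id and records the driver's span id as its parent, which is the piece the description says is currently lost. When the metadata carries no ids, `extract` falls back to a fresh root span, so untraced jobs are unaffected.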