[ 
https://issues.apache.org/jira/browse/SPARK-21962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438124#comment-16438124
 ] 

Apache Spark commented on SPARK-21962:
--------------------------------------

User 'devaraj-kavali' has created a pull request for this issue:
https://github.com/apache/spark/pull/21071

> Distributed Tracing in Spark
> ----------------------------
>
>                 Key: SPARK-21962
>                 URL: https://issues.apache.org/jira/browse/SPARK-21962
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Andrew Ash
>            Priority: Major
>
> Spark should support distributed tracing, the mechanism popularized by 
> Google in the [Dapper 
> paper|https://research.google.com/pubs/pub36356.html], in which network 
> requests carry additional metadata used to trace requests between services.
> This would be useful to me because I have OpenZipkin-style tracing in my 
> distributed application up to the Spark driver, and from the executors out to 
> my other services, but the trace is broken inside Spark between driver and 
> executor because the span IDs aren't propagated across that link.
> An initial implementation could instrument the most important network calls 
> with trace IDs (such as launching and finishing tasks), and incrementally add 
> tracing to other calls (torrent block distribution, external shuffle 
> service, etc.) as the feature matures.
> Search keywords: Dapper, Brave, OpenZipkin, HTrace
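
For readers unfamiliar with the propagation step the description refers to, here is a minimal sketch of carrying a span context from a driver to an executor. It is not Spark's API and not the approach taken in the linked pull request. The header names follow the OpenZipkin B3 convention; SpanContext, inject, extract_child, and the launch_task_properties dict are hypothetical names invented for this illustration, with a plain dict standing in for whatever metadata a task-launch message would carry.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

# B3 propagation header names used by OpenZipkin.
TRACE_ID = "X-B3-TraceId"
SPAN_ID = "X-B3-SpanId"
PARENT_SPAN_ID = "X-B3-ParentSpanId"


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical container for the identifiers that travel with a request."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


def new_span_id() -> str:
    # 64-bit identifier rendered as 16 lowercase hex characters.
    return secrets.token_hex(8)


def inject(ctx: SpanContext, carrier: Dict[str, str]) -> None:
    """Driver side: copy the current span context into the outgoing metadata."""
    carrier[TRACE_ID] = ctx.trace_id
    carrier[SPAN_ID] = ctx.span_id
    if ctx.parent_span_id is not None:
        carrier[PARENT_SPAN_ID] = ctx.parent_span_id


def extract_child(carrier: Dict[str, str]) -> Optional[SpanContext]:
    """Executor side: start a child span of the span the driver sent."""
    trace_id = carrier.get(TRACE_ID)
    parent = carrier.get(SPAN_ID)
    if trace_id is None or parent is None:
        return None  # no tracing metadata attached to this message
    # Same trace ID, fresh span ID, and the driver's span as parent.
    return SpanContext(trace_id, new_span_id(), parent)


# Assumption: a dict stands in for the task-launch message metadata.
driver_span = SpanContext(trace_id=new_span_id(), span_id=new_span_id())
launch_task_properties: Dict[str, str] = {}
inject(driver_span, launch_task_properties)
executor_span = extract_child(launch_task_properties)
```

The executor-side span keeps the driver's trace ID and records the driver's span ID as its parent, which is what lets a tracing backend join the two halves into a single trace.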



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

