[ https://issues.apache.org/jira/browse/SPARK-33088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570936#comment-17570936 ]
Steve Loughran commented on SPARK-33088:
----------------------------------------

I'm playing with this and IOStatistics collection in Hadoop 3.3.3+ (HADOOP-16830). Has anyone any examples of implementations of this I can look at? I'm particularly curious as to how the driver-side plugin can get notified of task start/complete/fail, so it can register accumulators, extract their values and publish them, which is what I want to do. Are people using a different plugin point there?

> Enhance ExecutorPlugin API to include methods for task start and end events
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-33088
>                 URL: https://issues.apache.org/jira/browse/SPARK-33088
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Samuel Souza
>            Assignee: Samuel Souza
>            Priority: Major
>             Fix For: 3.1.0
>
> On [SPARK-24918|https://issues.apache.org/jira/browse/SPARK-24918]'s
> [SIPP|https://docs.google.com/document/d/1a20gHGMyRbCM8aicvq4LhWfQmoA5cbHBQtyqIA2hgtc/view#],
> it was raised to potentially add methods to the ExecutorPlugin interface on
> task start and end:
> {quote}The basic interface can just be a marker trait, as that allows a
> plugin to monitor general characteristics of the JVM (eg. monitor memory or
> take thread dumps). Optionally, we could include methods for task start and
> end events. This would allow more control on monitoring – eg., you could
> start polling thread dumps only if there was a task from a particular stage
> that had been taking too long. But anything task related is a bit trickier to
> decide the right api. Should the task end event also get the failure reason?
> Should those events get called in the same thread as the task runner, or in
> another thread?
> {quote}
> The ask is to add exactly that.
> I've put up a draft PR [in our fork of
> Spark|https://github.com/palantir/spark/pull/713] and I'm happy to push it
> upstream. Also happy to receive comments on what's the right interface to
> expose - not opinionated on that front; I tried to expose the simplest
> interface for now.
>
> The main reason for this ask is to propagate tracing information from the
> driver to the executors
> ([SPARK-21962|https://issues.apache.org/jira/browse/SPARK-21962] has some
> context). On
> [HADOOP-15566|https://issues.apache.org/jira/browse/HADOOP-15566] I see we're
> discussing how to add tracing to the Apache ecosystem, but my problem is
> slightly different: I want to use this interface to propagate tracing
> information to my framework of choice. If the Hadoop issue gets solved we'll
> have a framework to communicate tracing information inside the Apache
> ecosystem, but it's highly unlikely that all Spark users will use the same
> common framework. Therefore we should still provide plugin interfaces where
> the tracing information can be propagated appropriately.
>
> To give more color, in our case the tracing information is [stored in a
> thread
> local|https://github.com/palantir/tracing-java/blob/4.9.0/tracing/src/main/java/com/palantir/tracing/Tracer.java#L61],
> so it needs to be set in the same thread that is executing the task. [*]
>
> While our framework is specific, I imagine such an interface could be useful
> in general. Happy to hear your thoughts about it.
>
> [*] Something I did not mention was how to propagate the tracing information
> from the driver to the executors. For that I intend to use 1. the driver's
> localProperties, which 2. will eventually be propagated to the executors'
> TaskContext, which 3. I'll be able to access from the methods above.
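[Editor's note] A minimal sketch of what the proposed task hooks look like in use. The interface below mirrors a simplified subset of the `onTaskStart`/`onTaskSucceeded`/`onTaskFailed` hooks this issue adds to `org.apache.spark.api.plugin.ExecutorPlugin`, but it is redefined locally (with the failure reason reduced to a `String` instead of Spark's `TaskFailedReason`) so the example compiles without a Spark dependency; the counting plugin is hypothetical, for illustration only:

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch only: a local, simplified mirror of the executor-side task hooks
 * added by SPARK-33088. The real interface is
 * org.apache.spark.api.plugin.ExecutorPlugin, whose onTaskFailed receives a
 * TaskFailedReason rather than a String.
 */
interface ExecutorTaskHooks {
    default void onTaskStart() {}
    default void onTaskSucceeded() {}
    default void onTaskFailed(String failureReason) {}
}

/**
 * Hypothetical plugin: count task outcomes on the executor so they could
 * later be published as metrics (e.g. IOStatistics-style counters).
 */
class TaskCountingPlugin implements ExecutorTaskHooks {
    final AtomicLong started = new AtomicLong();
    final AtomicLong succeeded = new AtomicLong();
    final AtomicLong failed = new AtomicLong();

    @Override public void onTaskStart() { started.incrementAndGet(); }
    @Override public void onTaskSucceeded() { succeeded.incrementAndGet(); }
    @Override public void onTaskFailed(String reason) { failed.incrementAndGet(); }
}

public class PluginSketch {
    public static void main(String[] args) {
        TaskCountingPlugin p = new TaskCountingPlugin();
        // Simulate the executor invoking the hooks for two tasks.
        p.onTaskStart();
        p.onTaskSucceeded();
        p.onTaskStart();
        p.onTaskFailed("simulated failure");
        System.out.println(p.started.get() + " " + p.succeeded.get() + " " + p.failed.get());
        // prints: 2 1 1
    }
}
```

On the driver-side question raised in the comment above: to my knowledge the `DriverPlugin` interface does not receive per-task callbacks, so a `SparkListener` registered on the driver (its `onTaskEnd` event carries the task's accumulator updates via `TaskInfo`) is the usual plugin point there; treat that as a pointer to verify, not a definitive answer.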
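[Editor's note] The footnote's three-step propagation scheme can be sketched without Spark at all: the driver-set property travels with the task, and a hook running in the task's own thread copies it into the tracing library's thread-local before user code runs. Everything here is illustrative (a plain map stands in for the driver's localProperties and `TaskContext.getLocalProperty`, and a `ThreadLocal<String>` stands in for the tracing framework's thread-local state):

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Sketch of the footnote's scheme: why the hook must run in the same thread
 * as the task. All names are illustrative stand-ins, not Spark APIs.
 */
public class TracePropagationSketch {
    // Stand-in for a tracing library's thread-local trace state
    // (e.g. com.palantir.tracing.Tracer keeps its trace in a ThreadLocal).
    static final ThreadLocal<String> CURRENT_TRACE = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        // 1. Driver side: setLocalProperty("traceId", ...) -- simulated
        //    as a plain map that travels with the task.
        Map<String, String> localProperties = Map.of("traceId", "abc-123");

        ExecutorService taskRunner = Executors.newSingleThreadExecutor();
        Future<String> observed = taskRunner.submit(() -> {
            // 2. Executor side, inside the task's thread: an onTaskStart-style
            //    hook reads the propagated property (simulated) and installs it
            //    in the thread-local *before* the task body runs.
            CURRENT_TRACE.set(localProperties.get("traceId"));
            try {
                // 3. The task body now observes the trace id through the
                //    thread-local, exactly as the tracing framework expects.
                return CURRENT_TRACE.get();
            } finally {
                // Clear it so a reused pool thread does not leak the trace.
                CURRENT_TRACE.remove();
            }
        });
        System.out.println(observed.get()); // prints: abc-123
        taskRunner.shutdown();
    }
}
```

Setting the value from a separate monitoring thread would not work here, which is the point of asking for the hook to be invoked on the task runner thread.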
--
This message was sent by Atlassian Jira (v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org