[
https://issues.apache.org/jira/browse/SPARK-33088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251100#comment-17251100
]
Apache Spark commented on SPARK-33088:
--------------------------------------
User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/30823
> Enhance ExecutorPlugin API to include methods for task start and end events
> ---------------------------------------------------------------------------
>
> Key: SPARK-33088
> URL: https://issues.apache.org/jira/browse/SPARK-33088
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: Samuel Souza
> Assignee: Samuel Souza
> Priority: Major
> Fix For: 3.1.0
>
>
> On [SPARK-24918|https://issues.apache.org/jira/browse/SPARK-24918]'s
> [SPIP|https://docs.google.com/document/d/1a20gHGMyRbCM8aicvq4LhWfQmoA5cbHBQtyqIA2hgtc/edit#],
> it was proposed to potentially add task start and end methods to the
> ExecutorPlugin interface:
> {quote}The basic interface can just be a marker trait, as that allows a
> plugin to monitor general characteristics of the JVM (e.g. monitor memory or
> take thread dumps). Optionally, we could include methods for task start and
> end events. This would allow more control over monitoring – e.g., you could
> start polling thread dumps only if there was a task from a particular stage
> that had been taking too long. But for anything task related it is a bit
> trickier to decide the right API. Should the task end event also get the
> failure reason? Should those events get called in the same thread as the task
> runner, or in another thread?
> {quote}
> This issue asks to add exactly that. I've put up a draft PR [in our fork of
> Spark|https://github.com/palantir/spark/pull/713] and I'm happy to push it
> upstream. I'm also happy to receive comments on the right interface to
> expose; I'm not opinionated on that front and have tried to expose the
> simplest interface for now.
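> For illustration, here is a minimal sketch of what such hooks could look
> like, with default no-op methods so that existing plugins stay
> source-compatible. The interface name, the method names, and the
> failure-reason parameter are assumptions for discussion, not a final API:
> {code:java}
> import org.apache.spark.TaskFailedReason;
>
> // Hypothetical task-event extension of the executor plugin interface.
> public interface ExecutorTaskEventPlugin {
>     // Called on the task runner thread just before the task body runs.
>     default void onTaskStart() {}
>
>     // Called on the task runner thread after the task completes successfully.
>     default void onTaskSucceeded() {}
>
>     // Called when the task fails; passing the failure reason is one possible
>     // answer to the question raised in the quote above.
>     default void onTaskFailed(TaskFailedReason failureReason) {}
> }
> {code}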
> The main reason for this request is to propagate tracing information from the
> driver to the executors
> ([SPARK-21962|https://issues.apache.org/jira/browse/SPARK-21962] has some
> context). On
> [HADOOP-15566|https://issues.apache.org/jira/browse/HADOOP-15566] there is a
> discussion of how to add tracing across the Apache ecosystem, but my problem
> is slightly different: I want to use this interface to propagate tracing
> information to my framework of choice. If the Hadoop issue gets resolved
> we'll have a common way to communicate tracing information inside the Apache
> ecosystem, but it's highly unlikely that all Spark users will adopt the same
> framework. Therefore Spark should still provide plugin interfaces through
> which the tracing information can be propagated appropriately.
> To give more color, in our case the tracing information is [stored in a
> thread
> local|https://github.com/palantir/tracing-java/blob/4.9.0/tracing/src/main/java/com/palantir/tracing/Tracer.java#L61],
> so it needs to be set on the same thread that is executing the task. [*]
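> To make the thread-affinity constraint concrete, here is a simplified
> stand-in for such a thread-local tracer (the Tracer linked above is the real
> one; TraceLocal below is a hypothetical reduction of it):
> {code:java}
> // Simplified stand-in for a thread-local tracer: the trace ID set on a
> // thread is visible only to code running on that thread, which is why the
> // task event hooks must run on the task runner thread itself.
> public final class TraceLocal {
>     private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();
>
>     private TraceLocal() {}
>
>     public static void set(String traceId) { TRACE_ID.set(traceId); }
>
>     public static String get() { return TRACE_ID.get(); }
>
>     // Clearing at task end avoids leaking the ID to the next task that
>     // reuses the same executor thread.
>     public static void clear() { TRACE_ID.remove(); }
> }
> {code}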
> While our framework is specific to us, I imagine such an interface could be
> useful in general. I'm happy to hear your thoughts on it.
> [*] Something I did not mention is how to propagate the tracing information
> from the driver to the executors. For that I intend to use (1) the driver's
> localProperties, which (2) will eventually be propagated to the executors'
> TaskContext, which (3) I'll be able to access from the task event methods
> above, as sketched below.
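> Putting the three steps together: setLocalProperty and
> TaskContext.getLocalProperty are existing Spark APIs, while the "trace.id"
> key, the hook name, and TraceLocal (from the sketch above) are assumptions
> for illustration:
> {code:java}
> import org.apache.spark.SparkContext;
> import org.apache.spark.TaskContext;
>
> public final class TracePropagation {
>     // Hypothetical property key carrying the trace ID.
>     static final String TRACE_KEY = "trace.id";
>
>     // Step 1: on the driver, attach the trace ID to every job submitted
>     // from the current thread via local properties.
>     static void tagDriverThread(SparkContext sc, String traceId) {
>         sc.setLocalProperty(TRACE_KEY, traceId);
>     }
>
>     // Steps 2 and 3: local properties travel with each task to the executor,
>     // where a task-start hook running on the task runner thread can read them
>     // from the TaskContext and install the thread local. A matching task-end
>     // hook would call TraceLocal.clear() so the next task starts clean.
>     static void onTaskStart() {
>         String traceId = TaskContext.get().getLocalProperty(TRACE_KEY);
>         if (traceId != null) {
>             TraceLocal.set(traceId);
>         }
>     }
> }
> {code}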