[ 
https://issues.apache.org/jira/browse/SPARK-20368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20368:
------------------------------------

    Assignee:     (was: Apache Spark)

> Support Sentry on PySpark workers
> ---------------------------------
>
>                 Key: SPARK-20368
>                 URL: https://issues.apache.org/jira/browse/SPARK-20368
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: Alexander Shorin
>
> [Sentry|https://sentry.io] is a system, well known among Python developers, that
> captures, classifies, tracks and explains tracebacks, helping people better
> understand what went wrong, how to reproduce the issue and how to fix it.
> Any Spark application written in Python is actually divided into two parts:
> 1. The part that runs on the driver side. The user has full control over this
> part, so sending reports to Sentry from it is easy.
> 2. The part that runs on executors: Python UDFs and the other
> transformation functions. Unfortunately, there is currently no way to send
> such reports from here, and that is the part this feature is about.
> To simplify the development experience, it would be nice to have optional
> Sentry support at the PySpark worker level.
> What could this feature look like?
> 1. PySpark will have a new extra named {{sentry}} which installs the Sentry
> client and any other required dependencies. This is an optional
> install-time dependency.
> 2. The PySpark worker will be able to detect the presence of Sentry support
> and send error reports there.
> 3. All Sentry configuration will be done via Sentry's standard
> environment variables.
> What will this feature give users?
> 1. Better exceptions in Sentry. Currently, from the driver-side application,
> all of them are recorded as `Py4JJavaError`, with the real executor exception
> written in the traceback body.
> 2. A much easier way to understand the context when something went wrong and
> why.
> 3. Simpler debugging and reproduction of issues in Python UDFs.
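
Until such support exists in the PySpark worker itself, the behaviour described above (detect an optional Sentry client, report the executor-side exception, configure everything through environment variables) can be approximated in user code. The following is only a minimal sketch under stated assumptions: it assumes the `sentry_sdk` package as the client (the issue does not name one), and `report_to_sentry` and `parse_age` are hypothetical names introduced for illustration, not part of PySpark or Sentry.

```python
import functools


def report_to_sentry(func):
    """Hypothetical decorator: report exceptions raised by `func` to Sentry
    when a client is installed, then re-raise them unchanged."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            try:
                # Optional dependency: imported lazily so the function still
                # works on executors where the client is not installed.
                import sentry_sdk
            except ImportError:
                sentry_sdk = None
            if sentry_sdk is not None:
                # With no arguments, init() takes the DSN from the
                # SENTRY_DSN environment variable. Initialising on the error
                # path keeps the sketch short; it is not tuned for speed.
                sentry_sdk.init()
                sentry_sdk.capture_exception(exc)
                # Send the event before the worker process can exit.
                sentry_sdk.flush(timeout=2)
            # Spark still sees the original failure.
            raise

    return wrapper


@report_to_sentry
def parse_age(value):
    # Hypothetical function that would be used as a Python UDF body.
    return int(value)
```

A function wrapped this way could then be registered as a UDF as usual. Because the exception is re-raised, the driver still receives its `Py4JJavaError`, while Sentry (if installed and configured on the executors) records the original Python exception.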



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

