[ https://issues.apache.org/jira/browse/SPARK-20368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972914#comment-15972914 ]
Apache Spark commented on SPARK-20368:
--------------------------------------

User 'kxepal' has created a pull request for this issue:
https://github.com/apache/spark/pull/17671

> Support Sentry on PySpark workers
> ---------------------------------
>
>                 Key: SPARK-20368
>                 URL: https://issues.apache.org/jira/browse/SPARK-20368
>             Project: Spark
>          Issue Type: New Feature
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: Alexander Shorin
>
> [Sentry|https://sentry.io] is a system, well known among Python developers,
> for capturing, classifying, tracking, and explaining tracebacks. It helps
> people understand what went wrong, how to reproduce the issue, and how to
> fix it.
> Any Spark application written in Python is effectively divided into two
> parts:
> 1. The part that runs on the driver side. The user has full control over
> this part, so reporting to Sentry is easy here.
> 2. The part that runs on executors: Python UDFs and the other
> transformation functions. Unfortunately, there is currently no way to
> provide that kind of reporting here, and this is the part this feature is
> about.
> To simplify the development experience, it would be nice to have optional
> Sentry support at the PySpark worker level.
> What could this feature look like?
> 1. PySpark will have a new extra named {{sentry}} that installs the Sentry
> client and any other required dependencies. This is an optional
> install-time dependency.
> 2. The PySpark worker will be able to detect the presence of Sentry support
> and send error reports there.
> 3. All Sentry configuration will be done via Sentry's standard environment
> variables.
> What will this feature give users?
> 1. Better exceptions in Sentry. Currently, from the driver-side
> application, all of them are recorded as `Py4JJavaError`, with the real
> executor exception written in the traceback body.
> 2. A much easier understanding of the context in which things went wrong,
> and why.
> 3. Simpler debugging and reproduction of issues in Python UDFs.
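
The worker-side behaviour proposed in the issue (detect an optional Sentry client, configure it only from environment variables, report the real executor exception) might be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: it assumes the `sentry_sdk` package as the client and the `SENTRY_DSN` environment variable, neither of which is named in the issue, and `sentry_enabled` and `report_errors` are hypothetical helper names.

```python
import functools
import os

try:
    # Assumption: the optional {{sentry}} extra would install this package.
    import sentry_sdk
except ImportError:
    sentry_sdk = None

_initialized = False


def sentry_enabled():
    """True only if a client is importable and a DSN is set in the environment."""
    return sentry_sdk is not None and bool(os.environ.get("SENTRY_DSN"))


def report_errors(func):
    """Hypothetical wrapper: report exceptions from func to Sentry, then re-raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global _initialized
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if sentry_enabled():
                if not _initialized:
                    # With no arguments, the client reads SENTRY_DSN itself.
                    sentry_sdk.init()
                    _initialized = True
                sentry_sdk.capture_exception(exc)
            # Always re-raise so Spark still sees the task failure.
            raise
    return wrapper
```

A function run on executors could then be wrapped with `@report_errors` before being registered as a UDF. When the client is not installed or no DSN is set, the wrapper only re-raises, so the dependency stays optional.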
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org