[GitHub] [spark] HyukjinKwon edited a comment on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic

2020-05-28 Thread GitBox


HyukjinKwon edited a comment on pull request #28661:
URL: https://github.com/apache/spark/pull/28661#issuecomment-635404514


   I actually didn't care much about this at first, but I realised that people 
really dislike the JVM stacktrace in Python exceptions. Maybe it's because you 
(and I, and most people in Spark dev) are used to the Java side.
   
   It reminds me of Holden's talk, ["Debugging PySpark—Or Why is There a JVM 
Stack Trace in My 
Python?"](https://databricks.com/session/debugging-pyspark-or-why-is-there-a-jvm-stack-trace-in-my-python), 
which could be one of the references showing that users generally don't like it.
   
   I also think I should have added some more context in the PR description. 
This PR:
 - Hides the JVM stacktrace for the whitelisted exceptions such as 
`AnalysisException`, which usually give a reasonable and good enough exception 
message on their own (see the sketch after this list).
 - Handles the exceptions raised from Python UDFs and adds them to the 
whitelisted exceptions. Exceptions from Python UDFs always carry the same JVM 
stacktrace: 
https://github.com/apache/spark/blob/95aec091e4d8a45e648ce84d32d912f585eeb151/core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala#L515
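   
   As a rough illustration (not part of this PR's diff), this is roughly the 
user-facing difference for a whitelisted exception; the session setup and 
column name below are only for the example:
   
```python
from pyspark.sql import SparkSession
from pyspark.sql.utils import AnalysisException

spark = SparkSession.builder.getOrCreate()

try:
    # Selecting a column that does not exist raises AnalysisException,
    # one of the whitelisted exceptions this PR targets.
    spark.range(1).select("nonexistent_column").show()
except AnalysisException as e:
    # With this change, only the analyzer's message is printed on the console
    # instead of the message followed by the full JVM stacktrace.
    print(str(e))
```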
   
   If a somewhat arbitrary exception occurs, such as a runtime exception from a 
shuffle or a user-defined exception, there is no behaviour change.
   
   Plus, the full stacktrace is still written to the log files, so I think it's 
okay to remove it from the console. If users want to do a postmortem, they can 
check the log files. If they can run the job again, they can turn on this 
runtime configuration and execute it one more time to see the JVM stacktrace.
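   
   For completeness, a minimal sketch of how that rerun could look, assuming the 
runtime configuration ends up named `spark.sql.pyspark.jvmStacktrace.enabled` 
(the final key in the merged change may differ):
   
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Assumed configuration key; check the merged PR for the final name.
spark.conf.set("spark.sql.pyspark.jvmStacktrace.enabled", "true")

try:
    spark.range(1).select("nonexistent_column").show()
except Exception as e:
    # With the flag on, the JVM stacktrace is appended to the message again,
    # which is useful when rerunning for a postmortem.
    print(str(e))
```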



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


