BryanCutler commented on issue #24834: [WIP][SPARK-27992][PYTHON] Synchronize with Python connection thread to propagate errors
URL: https://github.com/apache/spark/pull/24834#issuecomment-500626611

I'm attaching the error messages from `toPandas()`:

1. master without Arrow [master_toPandas_error_no_arrow.txt](https://github.com/apache/spark/files/3274086/master_toPandas_error_no_arrow.txt)
2. master with Arrow [master_toPandas_error_with_arrow.txt](https://github.com/apache/spark/files/3274087/master_toPandas_error_with_arrow.txt)
3. this PR with Arrow [SPARK-27992_toPandas_error_with_arrow.txt](https://github.com/apache/spark/files/3274088/SPARK-27992_toPandas_error_with_arrow.txt)

To sum up the differences:

(2) master with Arrow is a little different because the Python error is a `RuntimeError` rather than a `Py4JJavaError`, and for some reason the `Driver stacktrace` section is empty. That is probably not a big deal, but it looks odd and I'm not sure why it is empty.

(3) this PR is more similar to (1) in that it raises a `Py4JJavaError` and the `Driver stacktrace` is populated correctly, but the exception is actually displayed twice: first when it occurs in the JVM serving thread, and again when it is propagated by Py4J. I'm not sure if there is a good way to avoid this.
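
For readers unfamiliar with the general pattern being discussed, here is a minimal pure-Python sketch of synchronizing with a serving thread so that its failure reaches the caller. It is only an illustration under stated assumptions, not the code in this PR: `ServingThread`, `get_result`, and `failing_job` are hypothetical names, and the real change involves a JVM thread and Py4J rather than `threading`.

```python
import threading


class ServingThread(threading.Thread):
    """Hypothetical stand-in for a serving thread; not Spark's actual class."""

    def __init__(self, work):
        super().__init__(daemon=True)
        self._work = work
        self.error = None
        self.result = None

    def run(self):
        try:
            self.result = self._work()
        except Exception as exc:
            # Capture the failure instead of letting the thread die silently.
            self.error = exc

    def get_result(self):
        # Synchronize with the thread, then surface its failure to the caller.
        self.join()
        if self.error is not None:
            raise RuntimeError("serving thread failed") from self.error
        return self.result


def failing_job():
    # Hypothetical job that fails while serving data.
    raise ValueError("task failed on executor")


thread = ServingThread(failing_job)
thread.start()
try:
    thread.get_result()
except RuntimeError as exc:
    caught = exc  # the original ValueError is available as caught.__cause__
```

Because the wrapper is raised with `from`, an uncaught traceback would show both the original exception and the re-raised one, which is loosely analogous to the doubled output described in (3).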
