Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20567
@ueshin, yup, I initially thought so, but realised that it might collect
twice (`_collectAsArrow`, then `collect`) and trigger two jobs because of a
single failure at execution time. Also, it seems it could catch arbitrary
exceptions raised at execution time.
For `createDataFrame`, I thought we are fine because at least it won't
trigger multiple jobs.
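
To illustrate the double-collect concern, here is a minimal sketch in plain Python. It does not use Spark: `FakeDataFrame`, `to_pandas_broad_fallback`, and `to_pandas_narrow_fallback` are hypothetical names made up for this example, and the counter only imitates a job being launched. The narrow variant is one possible alternative under the assumption that schema support can be checked before execution; it is not necessarily what this PR does.

```python
class FakeDataFrame:
    """Hypothetical stand-in for a Spark DataFrame that counts launched jobs."""

    def __init__(self, fail_at_runtime):
        self.fail_at_runtime = fail_at_runtime
        self.jobs_triggered = 0

    def _collect_as_arrow(self):
        # The job is launched before the failure surfaces.
        self.jobs_triggered += 1
        if self.fail_at_runtime:
            raise RuntimeError("failure during execution")
        return ["arrow-batch"]

    def collect(self):
        self.jobs_triggered += 1
        return ["row"]


def to_pandas_broad_fallback(df):
    """Wrap the whole Arrow path: any execution-time error causes a second job."""
    try:
        return df._collect_as_arrow()
    except Exception:  # also swallows arbitrary execution-time exceptions
        return df.collect()


def to_pandas_narrow_fallback(df, schema_supported):
    """Decide on the fallback before execution, so at most one job runs."""
    if not schema_supported:
        return df.collect()
    return df._collect_as_arrow()  # execution-time errors propagate to the caller
```

With a runtime failure, the broad variant launches two jobs and hides the original error, while the narrow variant launches one job and lets the error propagate.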