GitHub user kxepal commented on the issue:
https://github.com/apache/spark/pull/17671
@HyukjinKwon
The specific reason is to simplify debugging and make errors easier to understand by integrating PySpark with one of the most popular error tracking systems among Python developers. In other words, to improve the user experience.
It's not just a maintenance concern: you never know when or how your production job will crash, or whether you'll even be able to reproduce the issue to track down and fix the bug. You want this integration on all the time.
What you propose is to handle this on the application side. How many UDFs would I have to rewrap to make that work? How many times would I have to explain this custom magic to newcomers? How many times would I have to copy-paste that solution between projects? That approach doesn't scale well and brings no fun to PySpark development, especially when it could be done once on the PySpark side at no cost. See the sketch below for what that per-UDF wrapping looks like.
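To make the cost concrete, here is roughly what the application-side wrapping involves; a minimal sketch, assuming the sentry-sdk client (the names `sentry_wrapped`, `normalize`, and the DSN are illustrative, not from this PR):

```python
import functools

import sentry_sdk
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

SENTRY_DSN = "https://<key>@sentry.example.com/1"  # hypothetical DSN

def sentry_wrapped(fn):
    """Report any exception raised inside the UDF to Sentry, then re-raise."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # The wrapper runs on the executor, so the Sentry client has to be
        # initialized there, not just on the driver.
        if sentry_sdk.Hub.current.client is None:
            sentry_sdk.init(dsn=SENTRY_DSN)
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            sentry_sdk.capture_exception(exc)
            raise  # keep the original failure semantics
    return wrapper

# Every UDF in every project needs this extra decorator:
@udf(StringType())
@sentry_wrapped
def normalize(value):
    return value.strip().lower()
```

Multiply that decorator by every UDF, every project, and every newcomer who has to learn it, versus one hook inside PySpark itself.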
Could trying that patch with Sentry change your mind?