Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
> The behavior consistency among Python/Scala/R/JAVA does not mean a bug,
> right?
In this case specifically, `collect` in PySpark is the only action that doesn't
work, whereas all other actions such as `foreach` and `show`, and the same
cases in the other language APIs, do work. Also, that's what the query
execution listener's description says. Do you believe you would make this
exception for PySpark specifically in any case?
I see that `foreach` and others were fixed in
https://github.com/apache/spark/commit/154351e6dbd24c4254094477e3f7defcba979b1a
and that `collect` is included in the original commit:
https://github.com/apache/spark/commit/15ff85b3163acbe8052d4489a00bcf1d2332fcf0
> I am not against this specific PR. All the committers need to be really
> careful when they make a decision to backport a behavior change. If any
> committer does it, we should jump in and stop the backport. This is what we
> should do.
Let's open a discussion on the mailing list and see if we can reach an
agreement. I don't think this is the first time we have talked about this, and
I think it's better to open a proper discussion and make a decision.