Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/15835
Ah, I missed the last comment from the old PR. Okay, we can shape this
up more nicely. BTW, Spark packs small partitions into each task, so I guess
this would not always introduce a lot of tasks, but yes, it is still a
valid point to reduce the number of tasks.
Right, I am fine with this. I had thought the original PR was taken over
without the courtesy of a notification or any discussion beforehand. As for
the benchmark, I have one in PR 14660, which I have also referenced from
other PRs.
I have just a few minor notes. First, maybe `Closes #14649` can be added
at the end of the PR description so that the committers' merge script can
close the original PR if this one gets merged. Second, there are some
style guidelines I usually refer to:
the [wiki](https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide)
and
[databricks/scala-style-guide](https://github.com/databricks/scala-style-guide).
I will leave some comments on the changed files.