Github user NarineK commented on the pull request:
https://github.com/apache/spark/pull/12966#issuecomment-217739297
Thanks, @sun-rui!
Yes, that seems to be the case.
I've recently hit the case where the number of partitions was fewer than the
number of actual groups.
I've tried the same thing on my previous implementation with groupByKey ->
flatMap and it works perfectly fine with any repartitioning.
Maybe @davies has some suggestions about this.
If there is a partitioner that guarantees each group will be in a single
partition, then we can use it; otherwise this won't give us the expected
result.
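
To illustrate the concern, here is a minimal sketch in plain Python (not Spark or SparkR code). It assumes the problem is that the function is applied once per partition, so a partition holding several groups gets treated as one group. The helper names `hash_partition`, `apply_per_partition` and `apply_per_group` are hypothetical and exist only for this example.

```python
from collections import defaultdict

def hash_partition(rows, key, num_partitions):
    """Place each row in a partition chosen by the hash of its key.

    Rows sharing a key always land in the same partition, but one
    partition may still hold several distinct keys.
    """
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

def apply_per_partition(partitions, func):
    # Treats every non-empty partition as if it were exactly one group.
    return [func(p) for p in partitions if p]

def apply_per_group(partitions, key, func):
    # Regroups rows by key inside each partition before applying func.
    results = []
    for p in partitions:
        groups = defaultdict(list)
        for row in p:
            groups[row[key]].append(row)
        results.extend(func(g) for g in groups.values())
    return results

# Four groups (keys 0..3) spread over only two partitions.
rows = [{"k": k, "v": v} for k, v in [(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (1, 6)]]
parts = hash_partition(rows, "k", 2)

per_partition = apply_per_partition(parts, len)  # one result per partition
per_group = apply_per_group(parts, "k", len)     # one result per actual group
```

With four groups and two partitions, the per-partition variant yields only two results, while regrouping by key inside each partition yields one result per group, which is roughly what the groupByKey -> flatMap approach gives.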