Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16739
@felixcheung I was referring to the
` * However, if you're doing a drastic coalesce, e.g. to numPartitions = 1,
 * this may result in your computation taking place on fewer nodes than
 * you like (e.g. one node in the case of numPartitions = 1). To avoid this,
 * you can pass shuffle = true. This will add a shuffle step, but means the
 * current upstream partitions will be executed in parallel (per whatever
 * the current partitioning is).
` warning,
but the way coalesce caps out based on numSlices also sounds important to
document (and is potentially confusing).