Thanks Francois for the comment and useful link. I understand the problem
better now.
best,
/Shahab
On Wed, Feb 18, 2015 at 10:36 AM, wrote:
In a nutshell: because it moves all of your data across the cluster, whereas
other operations (e.g. reduce) summarize the data in one form or another
before moving it.
For the longer answer:
http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_grou
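To make the difference concrete, here is a rough sketch in plain Python (not Spark code). It only simulates the idea: the partitions, the data, and both function names are made up for illustration, and "shuffle" here just means counting how many records would have to leave their partition.

```python
from collections import defaultdict


def shuffle_group_then_sum(partitions):
    """groupBy-style: every (key, value) record is moved, then summed per key."""
    # All records leave their partition unchanged.
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def combine_then_shuffle(partitions):
    """reduce-style: sum inside each partition first, then move the partial sums."""
    shuffled = []
    for part in partitions:
        partial = defaultdict(int)
        for key, value in part:
            partial[key] += value
        # At most one record per key leaves each partition.
        shuffled.extend(partial.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


# Hypothetical data: two partitions of (word, count) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("a", 1), ("c", 1)],
]

group_totals, group_moved = shuffle_group_then_sum(partitions)
reduce_totals, reduce_moved = combine_then_shuffle(partitions)
```

Both approaches give the same totals, but in this toy data the first moves all 8 records while the second moves only 5 partial sums. The gap grows with the number of repeated keys per partition.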
Hi,
Based on what I could see in the Spark UI, the "groupBy" transformation is
quite slow (it takes a lot of time) compared to other operations.
Is there any reason that groupBy is slow?
shahab