On Thu, Jun 25, 2015 at 5:01 PM, Kannappan Sirchabesan <buildka...@gmail.com> wrote:
> On Jun 26, 2015, at 12:46 AM, Sven Krasser <kras...@gmail.com> wrote:
>
> > In that case the reduceByKey operation will likely not give you any
> > benefit (since you are not aggregating data into smaller values but instead
> > building the same large list you'd build with groupByKey).
>
> great. thanks!. i overlooked that. I guess it might even be better to use
> groupByKey if the aggregated list is very huge for some keys?.

That'd be the most straightforward way to express it, yes. If you can,
however, you should do whatever computation you need to do on the list
during aggregation (e.g. using combineByKey).

--
www.skrasser.com <http://www.skrasser.com/?utm_source=sig>
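
To illustrate the combineByKey suggestion above, here is a minimal sketch in plain Python (no Spark needed) of how the three combineByKey functions can fold values into a small accumulator instead of building a per-key list. The thread does not say which computation is needed, so the per-key mean is only an assumed example, and `combine_by_key` and the sample `partitions` are hypothetical stand-ins for an RDD and its partitions.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Hypothetical helper that emulates combineByKey on lists of (key, value) pairs."""
    # Step 1: combine within each partition (what happens before the shuffle).
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            if key in acc:
                acc[key] = merge_value(acc[key], value)
            else:
                acc[key] = create_combiner(value)
        per_partition.append(acc)

    # Step 2: merge the per-partition combiners (what happens after the shuffle).
    merged = {}
    for acc in per_partition:
        for key, combiner in acc.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], combiner)
            else:
                merged[key] = combiner
    return merged


# Assumed sample data: two partitions of (key, value) pairs.
partitions = [
    [("a", 1.0), ("b", 4.0), ("a", 3.0)],
    [("a", 5.0), ("b", 6.0)],
]

# Keep only a running (sum, count) per key rather than the full list of values.
sums_counts = combine_by_key(
    partitions,
    create_combiner=lambda v: (v, 1),
    merge_value=lambda acc, v: (acc[0] + v, acc[1] + 1),
    merge_combiners=lambda x, y: (x[0] + y[0], x[1] + y[1]),
)

means = {key: total / count for key, (total, count) in sums_counts.items()}
```

In PySpark the same three functions would be passed to `rdd.combineByKey(createCombiner, mergeValue, mergeCombiners)`. The accumulator stays a fixed-size tuple per key, so nothing like the large list from groupByKey has to be held in memory or shuffled.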