Grouping is applied in the aggregation.
From: holden.ka...@gmail.com [mailto:holden.ka...@gmail.com] On Behalf Of
Holden Karau
Sent: Thu, Mar 10, 2016 13:56
To: Gerhard Fiedler
Cc: user@spark.apache.org
Subject: Re: Partitioning to speed up processing?
Are they entire data set aggregates or is there some grouping applied?
On Thursday, March 10, 2016, Gerhard Fiedler wrote:
I have a number of queries that result in a sequence Filter > Project >
Aggregate. I wonder whether partitioning the input table makes sense.
Does Aggregate benefit from a partitioned input? If so, what partitions would
be most useful (related to the aggregations)?
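One way to reason about this question is a small model in plain Python. This is not Spark code, just a sketch: the rows, the "ok" filter, the sum aggregate, and all function names are hypothetical, and it assumes the aggregation groups by a single key. It compares input partitioned by the grouping key against input split round-robin, and checks which keys still have to be combined across partitions after each partition has run Filter > Project > Aggregate on its own.

```python
import zlib
from collections import defaultdict

# Hypothetical rows: (group_key, status, amount).
rows = [
    ("a", "ok", 10), ("b", "ok", 5), ("a", "bad", 7),
    ("c", "ok", 1), ("b", "ok", 2), ("a", "ok", 3),
]

def partition_by_key(rows, n):
    """Put all rows that share a group key into the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[zlib.crc32(row[0].encode()) % n].append(row)
    return parts

def partition_round_robin(rows, n):
    """Spread rows evenly, ignoring the group key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def filter_project_aggregate(part):
    """Run Filter > Project > Aggregate (sum per key) inside one partition."""
    totals = defaultdict(int)
    for key, status, amount in part:
        if status == "ok":         # Filter: keep only "ok" rows
            totals[key] += amount  # Project (key, amount), then Aggregate
    return dict(totals)

def keys_needing_merge(partials):
    """Keys that show up in more than one partition's partial result."""
    seen, shared = set(), set()
    for partial in partials:
        for key in partial:
            if key in seen:
                shared.add(key)
            seen.add(key)
    return shared

def merge(partials):
    """Combine partial sums from all partitions into final totals."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            totals[key] += value
    return dict(totals)

by_key = [filter_project_aggregate(p) for p in partition_by_key(rows, 2)]
round_robin = [filter_project_aggregate(p) for p in partition_round_robin(rows, 2)]

print(keys_needing_merge(by_key))       # no key spans two partitions
print(keys_needing_merge(round_robin))  # some keys span both partitions
print(merge(by_key) == merge(round_robin))
```

In this model, partitioning by the grouping key makes each partition's result final for its keys, so the cross-partition combine step has nothing left to do. In Spark terms that combine step corresponds to a shuffle; whether a particular Spark version actually skips it for a given partitioned table is not established in this thread and is worth confirming by inspecting the plan with explain().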
Do Filter and Project preserv