Sounds like a bug. If you can reproduce it on 1.6.3 (currently being voted
on), please open a JIRA.
On Thu, Nov 3, 2016 at 8:05 AM, Donald Matthews wrote:
While upgrading a program from Spark 1.5.2 to Spark 1.6.2, I've run into a
HiveContext GROUP BY that no longer works reliably.
The GROUP BY results are not always fully aggregated; instead, I get many
duplicate and triplicate sets of group values.
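Since the original query isn't shown, here is a minimal plain-Python sketch (not Spark code; all names are illustrative) of the difference between a fully aggregated result and the duplicated output described above:

```python
# Hypothetical illustration of the symptom (plain Python, not Spark).
# A correct GROUP BY produces exactly one row per distinct key; the bug
# described above yields several rows for the same key, as if
# per-partition partial aggregates were never merged.

rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]

def group_sum(rows):
    """Fully aggregated result: one (key, sum) pair per key."""
    out = {}
    for key, value in rows:
        out[key] = out.get(key, 0) + value
    return sorted(out.items())

# Expected, fully aggregated output: one row per key.
print(group_sum(rows))  # [('a', 4), ('b', 6)]

# The buggy behavior looks instead like partial results surviving in the
# output, e.g. [('a', 1), ('a', 3), ('b', 2), ('b', 4)] -- duplicate
# "sets of group values" for the same key.
```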
I've come up with a workaround that works for