> the right join keys. I’d suggest taking a look at the join execs
> and at how they build the result RDD from the partitions of the
> left and right RDDs (see doExecute(…)). Left/right outer does look
> surprising, though.
>
>
>
> You should see something like…
>
Hi all,
I found this jira for an issue I ran into recently:
https://issues.apache.org/jira/browse/SPARK-28771
My initial idea for a fix is to change SortMergeJoinExec's (and
ShuffledHashJoinExec's) requiredChildDistribution.
At least if all of the below conditions are met, we could require only a subset
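As a rough illustration of why clustering on only a subset of the join keys could still be correct: rows that agree on all join keys necessarily agree on any subset of them, so hash-partitioning both sides on the same subset still co-locates matching rows. A minimal self-contained Scala sketch of that argument (hypothetical names, not Spark's actual Distribution/Partitioning API):

```scala
// Hypothetical model, not Spark internals: a row keyed by column name.
case class Row(keys: Map[String, Int])

// Hash-"partition" a row on a chosen subset of its key columns.
def partition(row: Row, subset: Seq[String], numPartitions: Int): Int = {
  val h = subset.map(row.keys).hashCode()
  Math.floorMod(h, numPartitions)
}

val joinKeys = Seq("a", "b")
val subset   = Seq("a") // a strict subset of the join keys

// Two rows that match on ALL join keys also match on the subset,
// so they land in the same partition and the join still finds them.
val left  = Row(Map("a" -> 1, "b" -> 2))
val right = Row(Map("a" -> 1, "b" -> 2))
assert(partition(left, subset, 8) == partition(right, subset, 8))
```

The trade-off, of course, is skew: partitioning on fewer keys co-locates more non-matching rows in the same partition, which is presumably where the conditions mentioned above would come in.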
1. Where can I find information on how to run standard performance
tests/benchmarks?
2. For a new major Spark version, are performance degradations to existing
queries acceptable if they can be fixed by rewriting those queries equivalently?
On Thu, Jan 2, 2020 at 3:05 PM Brett Marcott wrote:
> Tha