Hi Xiao,
Performance-wise, without the manual tuning the query cannot finish, and
with the tuning it finishes in minutes on TPC-H 100G data.
I have created https://issues.apache.org/jira/browse/SPARK-11704 and
https://issues.apache.org/jira/browse/SPARK-11705 for these two
Hi, Zhan,
That sounds really interesting! Please @ me when you submit the PR. If
possible, please also post the performance difference.
Thanks,
Xiao Li
2015-11-11 14:45 GMT-08:00 Zhan Zhang:
> Hi Folks,
>
> I did some performance measurement based on TPC-H
Hi Folks,
I did some performance measurement based on TPC-H recently, and want to bring
up some performance issues I observed. Both are related to cartesian joins.
1. CartesianProduct implementation.
Currently CartesianProduct relies on RDD.cartesian, in which the computation is
realized as