Re: RE: Re: Re: sparksql running slow while joining_2_tables.

2015-05-13 Thread luohui20001
Hi Hao: I tried broadcast join with the following steps and found that my query is still running slow. I'm not sure whether I'm using broadcast join correctly: 1. Add "spark.sql.autoBroadcastJoinThreshold 104857600 (100MB)" in conf/spark-default.conf. 100MB is larger than either of my 2 tables. 2. start
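For reference, the 100MB threshold in step 1 works out to 104857600 bytes. A minimal Python sketch of that arithmetic follows. The helper name `threshold_line` is hypothetical, and the sketch assumes the property takes a plain integer byte count (as the Spark SQL guide describes it), so a "(100MB)" annotation would stay out of the file itself. Note also that the properties file Spark reads by default is named conf/spark-defaults.conf.

```python
def threshold_line(megabytes):
    """Return a spark-defaults.conf style line for the broadcast threshold.

    Hypothetical helper for illustration only. The property name comes from
    the thread; the value is assumed to be a plain integer byte count.
    """
    size_in_bytes = megabytes * 1024 * 1024  # convert MB to bytes
    return "spark.sql.autoBroadcastJoinThreshold %d" % size_in_bytes


# Print the line for the 100MB threshold mentioned in the message.
print(threshold_line(100))
```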

RE: Re: Re: RE: Re: Re: sparksql running slow while joining_2_tables.

2015-05-06 Thread java8964
...@intel.com; ssab...@gmail.com; user@spark.apache.org Subject: Re: Re: RE: Re: Re: sparksql running slow while joining_2_tables. Status update after I ran some tests: I modified some other parameters and found 2 that may be relevant, spark_worker_instance and spark.sql.shuffle.partitions. before Today I

Re: Re: RE: Re: Re: sparksql running slow while joining_2_tables.

2015-05-05 Thread luohui20001
. Thanks & Best regards! 罗辉 San.Luo - Original Message - From: To: "Cheng, Hao", "Wang, Daoyuan", "Olivier Girardot", "user", Subject: Re: RE: Re: Re: sparksql running slow while joining_2_tables. Date: 2015-05-06 09:51 db has 1.7mill

RE: Re: RE: Re: Re: sparksql running slow while joining_2_tables.

2015-05-04 Thread Wang, Daoyuan
You can use EXPLAIN EXTENDED SELECT …. From: luohui20...@sina.com [mailto:luohui20...@sina.com] Sent: Tuesday, May 05, 2015 9:52 AM To: Cheng, Hao; Olivier Girardot; user Subject: Re: RE: Re: Re: sparksql running slow while joining_2_tables. As far as I know, broadcast join is automatically enabled by
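The suggestion above could be tried from PySpark roughly as follows. This is only a sketch under stated assumptions: `sql_context` is an existing SQLContext (or HiveContext) supplied by the caller, the helper names `explain_statement` and `show_plans` are hypothetical, and the query is left to the caller because the actual join query is not shown in this thread.

```python
def explain_statement(query):
    """Prefix a SELECT statement with EXPLAIN EXTENDED (hypothetical helper)."""
    # Drop surrounding whitespace and a trailing semicolon before prefixing.
    return "EXPLAIN EXTENDED " + query.strip().rstrip(";")


def show_plans(sql_context, query):
    """Run EXPLAIN EXTENDED through an existing SQLContext and print the result.

    Assumes sql_context.sql(...) returns a DataFrame whose rows carry the
    plan text in their first column.
    """
    rows = sql_context.sql(explain_statement(query)).collect()
    for row in rows:
        print(row[0])
    return rows
```

Calling `show_plans(sqlContext, query)` with the join query should print the logical and physical plans. If the broadcast threshold took effect, the physical plan would be expected to contain a broadcast join operator instead of a shuffled one.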

Re: RE: Re: Re: sparksql running slow while joining_2_tables.

2015-05-04 Thread luohui20001
As far as I know, broadcast join is automatically enabled by spark.sql.autoBroadcastJoinThreshold; refer to http://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options. Also, how can I check my app's physical plan, and other things like the optimized plan, executable plan, etc.? Thanks -