Re: Re: sparksql running slow while joining 2 tables.

2015-05-05 Thread Olivier Girardot
Can you enable event logging and send us the logs?
Thank you!

On Tue, May 5, 2015 at 04:56, luohui20001 luohui20...@sina.com wrote:

 Yes, just the default of 1 executor. Thanks.

 Sent from my Xiaomi phone
 On May 4, 2015 at 10:01 PM, ayan guha guha.a...@gmail.com wrote:

 Are you using only 1 executor?

 On Mon, May 4, 2015 at 11:07 PM, luohui20...@sina.com wrote:

 Hi Olivier,

 Spark 1.3.1, with Java 1.8.0_45.

 I have also attached 2 pictures.

 It seems like a GC issue. I also tried different parameters, such as the
 driver and executor memory sizes, the memory fraction, and Java opts,
 but the issue still happens.


 

 Thanks & Best regards!
 罗辉 San.Luo

 - Original Message -
 From: Olivier Girardot ssab...@gmail.com
 To: luohui20...@sina.com, user user@spark.apache.org
 Subject: Re: sparksql running slow while joining 2 tables.
 Date: May 4, 2015, 20:46

 Hi,
 What is your Spark version?

 Regards,

 Olivier.

 On Mon, May 4, 2015 at 11:03, luohui20...@sina.com wrote:

 Hi guys,

 When I run a SQL query like select a.name,a.startpoint,a.endpoint,
 a.piece from db a join sample b on (a.name = b.name) where (b.startpoint
  a.startpoint + 25); Spark SQL takes minutes to finish, which may be
 caused by very long GC and shuffle times.

 Table db is created from a 56 MB text file, while table sample is
 26 MB; both are small.

 My Spark cluster is a standalone, pseudo-distributed cluster with an
 8 GB executor and a 4 GB driver.

 Any advice? Thank you, guys.



 

 Thanks & Best regards!
 罗辉 San.Luo

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org







 --
 Best Regards,
 Ayan Guha




Re: sparksql running slow while joining 2 tables.

2015-05-04 Thread Olivier Girardot
Hi,
What is your Spark version?

Regards,

Olivier.


Re: Re: sparksql running slow while joining 2 tables.

2015-05-04 Thread ayan guha
Are you using only 1 executor?


-- 
Best Regards,
Ayan Guha


RE: Reply: Re: sparksql running slow while joining 2 tables.

2015-05-04 Thread Cheng, Hao
Can you print out the physical plan?

EXPLAIN SELECT xxx…


RE: Reply: Re: sparksql running slow while joining 2 tables.

2015-05-04 Thread Cheng, Hao
Or have you tried a broadcast join?
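
For context, a broadcast join copies the smaller table to every executor and
joins against it in memory, so the larger table does not have to be shuffled.
Below is a minimal sketch of that idea in plain Python. It is not Spark's
implementation: the function name, the sample rows, and the comparison
operator in the predicate are all assumptions made for illustration (the
operator in the original query did not survive in this archive, so the
predicate is passed in by the caller).

```python
from collections import defaultdict


def broadcast_hash_join(large_rows, small_rows, key, predicate):
    """Join two lists of dicts on `key`, keeping pairs that satisfy `predicate`.

    The small side is loaded into an in-memory hash table (the "broadcast"
    copy); the large side is streamed past it, so neither side is shuffled.
    """
    # Build phase: index the small table by join key.
    index = defaultdict(list)
    for row in small_rows:
        index[row[key]].append(row)

    # Probe phase: look up each row of the large table in the index.
    joined = []
    for a in large_rows:
        for b in index.get(a[key], []):
            if predicate(a, b):
                joined.append((a, b))
    return joined


# Hypothetical sample data; only the column names come from the thread.
db = [
    {"name": "chr1", "startpoint": 100, "endpoint": 150, "piece": "ACGT"},
    {"name": "chr1", "startpoint": 500, "endpoint": 550, "piece": "TTGA"},
    {"name": "chr2", "startpoint": 10, "endpoint": 60, "piece": "GGCC"},
]
sample = [
    {"name": "chr1", "startpoint": 200},
    {"name": "chr3", "startpoint": 5},
]

# The "<" comparison is an assumption, chosen only to have a concrete filter.
result = broadcast_hash_join(
    db, sample, "name",
    lambda a, b: b["startpoint"] < a["startpoint"] + 25,
)
pieces = [a["piece"] for a, _ in result]
```

In Spark SQL, the size below which a table is considered for broadcasting is
controlled by spark.sql.autoBroadcastJoinThreshold; whether the planner
actually picks a broadcast join for these tables should show up in the
EXPLAIN output.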
