Re: Re: sparksql running slow while joining 2 tables.
Can you activate your eventLogs and send them us ? Thank you ! Le mar. 5 mai 2015 à 04:56, luohui20001 luohui20...@sina.com a écrit : Yes,just by default 1 executor.thanks 发自我的小米手机 在 2015年5月4日 下午10:01,ayan guha guha.a...@gmail.com写道: Are you using only 1 executor? On Mon, May 4, 2015 at 11:07 PM, luohui20...@sina.com wrote: hi Olivier spark1.3.1, with java1.8.0.45 and add 2 pics . it seems like a GC issue. I also tried with different parameters like memory size of driverexecutor, memory fraction, java opts... but this issue still happens. Thanksamp;Best regards! 罗辉 San.Luo - 原始邮件 - 发件人:Olivier Girardot ssab...@gmail.com 收件人:luohui20...@sina.com, user user@spark.apache.org 主题:Re: sparksql running slow while joining 2 tables. 日期:2015年05月04日 20点46分 Hi, What is you Spark version ? Regards, Olivier. Le lun. 4 mai 2015 à 11:03, luohui20...@sina.com a écrit : hi guys when i am running a sql like select a.name,a.startpoint,a.endpoint, a.piece from db a join sample b on (a.name = b.name) where (b.startpoint a.startpoint + 25); I found sparksql running slow in minutes which may caused by very long GC and shuffle time. table db is created from a txt file size at 56mb while table sample sized at 26mb, both at small size. my spark cluster is a standalone pseudo-distributed spark cluster with 8g executor and 4g driver manager. any advises? thank you guys. Thanksamp;Best regards! 罗辉 San.Luo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Best Regards, Ayan Guha
Re: sparksql running slow while joining 2 tables.
Hi, What is you Spark version ? Regards, Olivier. Le lun. 4 mai 2015 à 11:03, luohui20...@sina.com a écrit : hi guys when i am running a sql like select a.name,a.startpoint,a.endpoint, a.piece from db a join sample b on (a.name = b.name) where (b.startpoint a.startpoint + 25); I found sparksql running slow in minutes which may caused by very long GC and shuffle time. table db is created from a txt file size at 56mb while table sample sized at 26mb, both at small size. my spark cluster is a standalone pseudo-distributed spark cluster with 8g executor and 4g driver manager. any advises? thank you guys. Thanksamp;Best regards! 罗辉 San.Luo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Re: sparksql running slow while joining 2 tables.
Are you using only 1 executor? On Mon, May 4, 2015 at 11:07 PM, luohui20...@sina.com wrote: hi Olivier spark1.3.1, with java1.8.0.45 and add 2 pics . it seems like a GC issue. I also tried with different parameters like memory size of driverexecutor, memory fraction, java opts... but this issue still happens. Thanksamp;Best regards! 罗辉 San.Luo - 原始邮件 - 发件人:Olivier Girardot ssab...@gmail.com 收件人:luohui20...@sina.com, user user@spark.apache.org 主题:Re: sparksql running slow while joining 2 tables. 日期:2015年05月04日 20点46分 Hi, What is you Spark version ? Regards, Olivier. Le lun. 4 mai 2015 à 11:03, luohui20...@sina.com a écrit : hi guys when i am running a sql like select a.name,a.startpoint,a.endpoint, a.piece from db a join sample b on (a.name = b.name) where (b.startpoint a.startpoint + 25); I found sparksql running slow in minutes which may caused by very long GC and shuffle time. table db is created from a txt file size at 56mb while table sample sized at 26mb, both at small size. my spark cluster is a standalone pseudo-distributed spark cluster with 8g executor and 4g driver manager. any advises? thank you guys. Thanksamp;Best regards! 罗辉 San.Luo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Best Regards, Ayan Guha
RE: 回复:Re: sparksql running slow while joining 2 tables.
Can you print out the physical plan? EXPLAIN SELECT xxx… From: luohui20...@sina.com [mailto:luohui20...@sina.com] Sent: Monday, May 4, 2015 9:08 PM To: Olivier Girardot; user Subject: 回复:Re: sparksql running slow while joining 2 tables. hi Olivier spark1.3.1, with java1.8.0.45 and add 2 pics . it seems like a GC issue. I also tried with different parameters like memory size of driverexecutor, memory fraction, java opts... but this issue still happens. Thanksamp;Best regards! 罗辉 San.Luo - 原始邮件 - 发件人:Olivier Girardot ssab...@gmail.commailto:ssab...@gmail.com 收件人:luohui20...@sina.commailto:luohui20...@sina.com, user user@spark.apache.orgmailto:user@spark.apache.org 主题:Re: sparksql running slow while joining 2 tables. 日期:2015年05月04日 20点46分 Hi, What is you Spark version ? Regards, Olivier. Le lun. 4 mai 2015 à 11:03, luohui20...@sina.commailto:luohui20...@sina.com a écrit : hi guys when i am running a sql like select a.namehttp://a.name,a.startpoint,a.endpoint, a.piece from db a join sample b on (a.namehttp://a.name = b.namehttp://b.name) where (b.startpoint a.startpoint + 25); I found sparksql running slow in minutes which may caused by very long GC and shuffle time. table db is created from a txt file size at 56mb while table sample sized at 26mb, both at small size. my spark cluster is a standalone pseudo-distributed spark cluster with 8g executor and 4g driver manager. any advises? thank you guys. Thanksamp;Best regards! 罗辉 San.Luo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.orgmailto:user-h...@spark.apache.org
RE: 回复:Re: sparksql running slow while joining 2 tables.
Or, have you ever try broadcast join? From: Cheng, Hao [mailto:hao.ch...@intel.com] Sent: Tuesday, May 5, 2015 8:33 AM To: luohui20...@sina.com; Olivier Girardot; user Subject: RE: 回复:Re: sparksql running slow while joining 2 tables. Can you print out the physical plan? EXPLAIN SELECT xxx… From: luohui20...@sina.commailto:luohui20...@sina.com [mailto:luohui20...@sina.com] Sent: Monday, May 4, 2015 9:08 PM To: Olivier Girardot; user Subject: 回复:Re: sparksql running slow while joining 2 tables. hi Olivier spark1.3.1, with java1.8.0.45 and add 2 pics . it seems like a GC issue. I also tried with different parameters like memory size of driverexecutor, memory fraction, java opts... but this issue still happens. Thanksamp;Best regards! 罗辉 San.Luo - 原始邮件 - 发件人:Olivier Girardot ssab...@gmail.commailto:ssab...@gmail.com 收件人:luohui20...@sina.commailto:luohui20...@sina.com, user user@spark.apache.orgmailto:user@spark.apache.org 主题:Re: sparksql running slow while joining 2 tables. 日期:2015年05月04日 20点46分 Hi, What is you Spark version ? Regards, Olivier. Le lun. 4 mai 2015 à 11:03, luohui20...@sina.commailto:luohui20...@sina.com a écrit : hi guys when i am running a sql like select a.namehttp://a.name,a.startpoint,a.endpoint, a.piece from db a join sample b on (a.namehttp://a.name = b.namehttp://b.name) where (b.startpoint a.startpoint + 25); I found sparksql running slow in minutes which may caused by very long GC and shuffle time. table db is created from a txt file size at 56mb while table sample sized at 26mb, both at small size. my spark cluster is a standalone pseudo-distributed spark cluster with 8g executor and 4g driver manager. any advises? thank you guys. Thanksamp;Best regards! 罗辉 San.Luo - To unsubscribe, e-mail: user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.orgmailto:user-h...@spark.apache.org