Re: Join two dataframe - Timeout after 5 minutes

2015-09-24 Thread Shixiong Zhu
The timeout comes from the broadcast join in your plan (the BroadcastHashJoin in your
stack trace). You can increase "spark.sql.broadcastTimeout" to allow more time; the
default value is 300 seconds.
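
For example, a minimal PySpark sketch (assuming Spark 1.5 with a SQLContext; the
value 1200 and the app name are just illustrative):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="join-example")  # hypothetical app name
    sqlContext = SQLContext(sc)

    # Raise the broadcast timeout from the 300-second default (value is in seconds)
    sqlContext.setConf("spark.sql.broadcastTimeout", "1200")

You can also pass it at submit time, e.g.
spark-submit --conf spark.sql.broadcastTimeout=1200 your_script.py.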

Best Regards,
Shixiong Zhu

2015-09-24 15:16 GMT+08:00 Eyad Sibai :

> I am trying to join two tables using DataFrames with Python 3.4, and I am
> getting the following error.
>
> I ran it on my local machine with 2 workers on Spark 1.5.
>
> I always get a timeout if the job takes more than 5 minutes.

Join two dataframe - Timeout after 5 minutes

2015-09-24 Thread Eyad Sibai
I am trying to join two tables using DataFrames with Python 3.4, and I am
getting the following error.

I ran it on my local machine with 2 workers on Spark 1.5.

I always get a timeout if the job takes more than 5 minutes.




at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:49)
 at org.apache.spark.sql.execution.aggregate.TungstenAggregate.doExecute(TungstenAggregate.scala:69)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
 at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
 at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:142)
 at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:141)
 at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
 ... 33 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
 at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
 at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
 at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
 at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
 at scala.concurrent.Await$.result(package.scala:107)
 at org.apache.spark.sql.execution.joins.BroadcastHashJoin.doExecute(BroadcastHashJoin.scala:110)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
 at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
 at org.apache.spark.sql.execution.TungstenProject.doExecute(basicOperators.scala:86)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
 at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
 at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
 at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:119)
 at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:69)
 at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
 ... 41 more


2015-09-23 15:44:09,536 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}
2015-09-23 15:44:09,537 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL1/execution/json,null}
2015-09-23 15:44:09,538 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL1/execution,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL1/json,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL1,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL,null}
2015-09-23 15:44:09,539 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
2015-09-23 15:44:09,540 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
2015-09-23 15:44:09,541 INFO ContextHan