Re: [SQL] codegen on wide dataset throws StackOverflow

2015-06-26 Thread Josh Rosen
Which Spark version are you using? Can you file a JIRA for this issue? On Thu, Jun 25, 2015 at 6:35 AM, Peter Rudenko petro.rude...@gmail.com wrote: Hi, I have a small but very wide dataset (2000 columns). Trying to optimize a DataFrame pipeline for it, since it behaves very poorly comparing

Re: [SQL] codegen on wide dataset throws StackOverflow

2015-06-26 Thread Peter Rudenko
I'm using Spark 1.4.0. Sure, I will try to put together steps to reproduce and file a JIRA ticket. Thanks, Peter Rudenko On 2015-06-26 11:14, Josh Rosen wrote: Which Spark version are you using? Can you file a JIRA for this issue? On Thu, Jun 25, 2015 at 6:35 AM, Peter Rudenko petro.rude...@gmail.com

[SQL] codegen on wide dataset throws StackOverflow

2015-06-25 Thread Peter Rudenko
Hi, I have a small but very wide dataset (2000 columns). I'm trying to optimize a DataFrame pipeline for it, since it performs very poorly compared to the equivalent RDD operations. With spark.sql.codegen=true it throws a StackOverflowError:
15/06/25 16:27:16 INFO CacheManager: Partition rdd_12_3 not found, computing it
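
For anyone who wants to try this locally, here is a rough sketch of what a reproduction attempt could look like in PySpark against Spark 1.4. It is not the original poster's pipeline, which the thread does not show. The column names, row count, the `+ 1` projection, and the helper names (`column_names`, `projection_exprs`, `make_row`, `run_repro`) are all made up for illustration, and whether this particular query hits the same overflow has not been confirmed.

```python
NUM_COLS = 2000  # width reported in the thread
NUM_ROWS = 100   # assumption: the thread only says the dataset is "small"


def column_names(num_cols):
    """Hypothetical column names c0, c1, ..., c<num_cols - 1>."""
    return ["c%d" % i for i in range(num_cols)]


def projection_exprs(names):
    """One SQL expression per column, so the projection touches every column."""
    return ["%s + 1 AS %s_plus" % (n, n) for n in names]


def make_row(r, num_cols):
    """A synthetic row of doubles; the values themselves do not matter."""
    return tuple(float(r + c) for c in range(num_cols))


def run_repro(num_cols=NUM_COLS, num_rows=NUM_ROWS):
    """Build a wide DataFrame with codegen enabled, then cache and count it."""
    # Imported here so the helpers above stay usable without Spark installed.
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext
    from pyspark.sql.types import DoubleType, StructField, StructType

    conf = SparkConf().setAppName("wide-codegen-repro").setMaster("local[2]")
    sc = SparkContext(conf=conf)
    try:
        sqlContext = SQLContext(sc)
        # The setting named in the thread.
        sqlContext.setConf("spark.sql.codegen", "true")

        names = column_names(num_cols)
        schema = StructType([StructField(n, DoubleType(), False) for n in names])
        rows = sc.parallelize(range(num_rows)).map(lambda r: make_row(r, num_cols))
        df = sqlContext.createDataFrame(rows, schema)

        # Assumption: a projection over all columns followed by cache() + count(),
        # chosen because the log line in the thread comes from CacheManager.
        projected = df.selectExpr(*projection_exprs(names))
        return projected.cache().count()
    finally:
        sc.stop()
```

Calling `run_repro()` from a script launched with `spark-submit` runs the query. Lowering `num_cols` until the job succeeds may help narrow down the width at which the failure starts, which would be useful detail for the JIRA ticket.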