withColumn is very slow with datasets with large number of columns

2015-04-30 Thread alexandre Clement
Hi all, I'm experiencing a serious performance problem when using withColumn on a dataset with a large number of columns. It is very slow: on a dataset with 100 columns it takes a few seconds. The following code snippet demonstrates the problem:
val custs = Seq( Row(1, "Bob", 21, 80.5), Row(2, "Bobby", 21,

Re: withColumn is very slow with datasets with large number of columns

2015-04-30 Thread alexandre Clement
I have reported the issue on JIRA: https://issues.apache.org/jira/browse/SPARK-7276

On Thu, Apr 30, 2015 at 4:36 PM, alexandre Clement wrote:
> Hi all,
>
> I'm experiencing a serious performance problem when using withColumn on a
> dataset with a large number of columns. I