Spark SQL 1.1.0 - large insert into parquet runs out of memory

2014-09-23 Thread Dan Dietterich
I am trying to load data from CSV format into Parquet using Spark SQL. It consistently runs out of memory. The environment is:
* standalone cluster using HDFS and Hive metastore from HDP 2.0
* Spark 1.1.0
* Parquet jar files (v1.5) explicitly added when starting spark-sql.
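The message does not include the actual statements, but a CSV-to-Parquet load through Spark SQL typically pairs a delimited staging table with a Parquet-backed target table and an INSERT ... SELECT between them. The sketch below issues hypothetical equivalents of those statements through a HiveContext; the table names, columns, and paths are invented for illustration and are not from this thread.

```scala
// Sketch only: roughly the kind of statements a CSV-to-Parquet load issues,
// here driven from a HiveContext rather than the spark-sql CLI.
// Table names, columns, and paths are placeholders, not from the thread.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object CsvToParquetSql {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CsvToParquetSql"))
    val hive = new HiveContext(sc)

    // Staging table over raw CSV files already sitting in HDFS.
    hive.sql("""CREATE EXTERNAL TABLE IF NOT EXISTS csv_staging (id INT, payload STRING)
                ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
                LOCATION 'hdfs:///data/csv_staging'""")

    // Parquet-backed target table registered in the Hive metastore.
    // (On older Hive versions the Parquet SerDe and input/output formats may need
    // to be spelled out explicitly instead of STORED AS PARQUET.)
    hive.sql("""CREATE TABLE IF NOT EXISTS parquet_target (id INT, payload STRING)
                STORED AS PARQUET""")

    // The large insert that the thread reports running out of memory on.
    // count() forces execution in case the returned SchemaRDD is evaluated lazily.
    hive.sql("INSERT OVERWRITE TABLE parquet_target SELECT id, payload FROM csv_staging").count()

    sc.stop()
  }
}
```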

Re: Spark SQL 1.1.0 - large insert into parquet runs out of memory

2014-09-23 Thread Dan Dietterich
I have only been using spark through the SQL front-end (CLI or JDBC). I don't think I have access to saveAsParquetFile from there, do I?
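For reference, saveAsParquetFile is part of the Scala/Java programmatic API (a method on SchemaRDD in Spark 1.1.0) rather than the SQL front-end. A minimal sketch of that route, with an invented CSV layout, parsing, and paths, might look like this:

```scala
// Minimal sketch of the programmatic alternative: build a SchemaRDD from the CSV
// and call saveAsParquetFile on it. Schema, paths, and parsing are placeholders.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SaveAsParquetSketch {
  // Hypothetical record shape; the real columns come from the user's CSV.
  case class Record(id: Int, payload: String)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SaveAsParquetSketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD[Record] -> SchemaRDD conversion

    // Parse each CSV line into a Record (naive split, no quoting support).
    val records = sc.textFile("hdfs:///data/input_csv").map { line =>
      val fields = line.split(",")
      Record(fields(0).trim.toInt, fields(1).trim)
    }

    // Write the SchemaRDD out as a directory of Parquet files.
    records.saveAsParquetFile("hdfs:///data/output_parquet")

    sc.stop()
  }
}
```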