Re: Data loading to Parquet using spark

2014-07-07 Thread Michael Armbrust
SchemaRDDs, provided by Spark SQL, have a saveAsParquetFile command. You can turn a normal RDD into a SchemaRDD using the techniques described here: http://spark.apache.org/docs/latest/sql-programming-guide.html This should work with Impala, but if you run into any issues please let me know.
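For example, here is a minimal sketch against the Spark 1.0 API; the Person case class, the sample data, and the HDFS output path are made up for illustration:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Hypothetical record type; substitute your own schema.
    case class Person(name: String, age: Int)

    val sc = new SparkContext(new SparkConf().setAppName("ParquetWrite"))
    val sqlContext = new SQLContext(sc)
    // Brings the implicit conversion from an RDD of case classes to a SchemaRDD into scope.
    import sqlContext.createSchemaRDD

    // A normal RDD of case classes...
    val people = sc.parallelize(Seq(Person("alice", 30), Person("bob", 25)))
    // ...is converted to a SchemaRDD by the implicit and written out as a Parquet file.
    people.saveAsParquetFile("hdfs:///user/spark/people.parquet")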

Re: Data loading to Parquet using spark

2014-07-07 Thread Soren Macbeth
I typed "spark parquet" into Google and the top result was this blog post about reading and writing Parquet files from Spark: http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/
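For reading the files back in Spark, the blog post works with the raw Hadoop input/output formats; the sketch below uses the Spark SQL API instead, and reuses the hypothetical path and schema from the write example above:

    // Load the Parquet file back as a SchemaRDD and query it with Spark SQL.
    val parquetPeople = sqlContext.parquetFile("hdfs:///user/spark/people.parquet")
    parquetPeople.registerAsTable("people")
    val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
    adults.collect().foreach(println)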

Data loading to Parquet using spark

2014-07-06 Thread Shaikh Riyaz
Hi, We are planning to use Spark to load data into Parquet, and this data will be queried by Impala for visualization through Tableau. Can we achieve this flow? How do we load data into Parquet from Spark? Will Impala be able to access the data loaded by Spark? I will greatly appreciate if