SchemaRDDs, provided by Spark SQL, have a saveAsParquetFile command. You
can turn a normal RDD into a SchemaRDD using the techniques described here:
http://spark.apache.org/docs/latest/sql-programming-guide.html
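For illustration, here is a minimal sketch of that conversion plus the Parquet write, assuming the Spark 1.0-era API; the Record case class, app name, and output path below are just placeholders, not anything specific to your setup:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Hypothetical record type; replace with whatever schema your data has.
case class Record(key: Int, value: String)

object ParquetWriteExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "ParquetWriteExample")
    val sqlContext = new SQLContext(sc)
    // Brings in the implicit conversion from an RDD of case classes to a SchemaRDD.
    import sqlContext.createSchemaRDD

    // Build an ordinary RDD of case-class objects.
    val rdd = sc.parallelize(1 to 100).map(i => Record(i, s"val_$i"))

    // The implicit conversion gives us a SchemaRDD, which has saveAsParquetFile.
    rdd.saveAsParquetFile("hdfs:///tmp/records.parquet")  // placeholder output path

    sc.stop()
  }
}

Impala should then be able to point an external table at that output directory, since the files are standard Parquet.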
This should work with Impala, but if you run into any issues please let me
know.
I typed "spark parquet" into Google and the top result was this blog post
about reading and writing Parquet files from Spark:
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/
On Mon, Jul 7, 2014 at 5:23 PM, Michael Armbrust <mich...@databricks.com> wrote:
Hi,
We are planning to use Spark to load data into Parquet, and this data will be
queried by Impala for visualization in Tableau.
Can we achieve this flow? How do we load data into Parquet from Spark? Will
Impala be able to access the data loaded by Spark?
I would greatly appreciate any guidance on this.