Hi all,
I might be missing something, but does the new Spark 1.3 sqlContext save
interface support using Avro as the schema structure when writing Parquet
files, in a similar way to AvroParquetWriter (which I've got working)?
I've seen how you can load an Avro file and save it as Parquet from
DataFrames? If so,
that will be a lot simpler!
Thanks,
Ewan
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: 19 May 2015 11:01
To: Ewan Leith; user@spark.apache.org
Subject: Re: AvroParquetWriter equivalent in Spark 1.3 sqlContext Save or
createDataFrame Interfaces?
Hi Ewan,
Different
Thanks Cheng, that's brilliant, you've saved me a headache.
Ewan
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: 19 May 2015 11:58
To: Ewan Leith; user@spark.apache.org
Subject: Re: AvroParquetWriter equivalent in Spark 1.3 sqlContext Save or
createDataFrame Interfaces?
That's right
Hi Ewan,
Unlike AvroParquetWriter, Spark SQL uses StructType as the
intermediate schema format. So when converting Avro files to Parquet
files, we internally convert the Avro schema to a Spark SQL StructType
first, and then convert that StructType to a Parquet schema.
Cheng
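
To make the two-step conversion described above concrete, here is a toy sketch in plain Python. It does not call Spark and is not Spark's actual converter. The function names, the type tables, and the tuple used to stand in for a StructType are all assumptions made for illustration, and only flat records with primitive fields are handled.

```python
# Toy illustration of: Avro schema -> StructType-like form -> Parquet schema.
# All names below are hypothetical; this is NOT Spark's implementation.

# Assumed mapping from Avro primitive types to Spark SQL type names.
AVRO_TO_SPARK = {
    "boolean": "BooleanType",
    "int": "IntegerType",
    "long": "LongType",
    "float": "FloatType",
    "double": "DoubleType",
    "bytes": "BinaryType",
    "string": "StringType",
}

# Assumed mapping from Spark SQL type names to
# (Parquet primitive type, optional annotation).
SPARK_TO_PARQUET = {
    "BooleanType": ("boolean", None),
    "IntegerType": ("int32", None),
    "LongType": ("int64", None),
    "FloatType": ("float", None),
    "DoubleType": ("double", None),
    "BinaryType": ("binary", None),
    "StringType": ("binary", "UTF8"),
}


def avro_to_struct_type(avro_schema):
    """Step 1: turn an Avro record schema (a dict) into a list of
    (name, spark_type, nullable) tuples standing in for a StructType."""
    fields = []
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        nullable = False
        # An Avro union such as ["null", "string"] marks an optional field.
        if isinstance(avro_type, list):
            non_null = [t for t in avro_type if t != "null"]
            if len(non_null) != 1:
                raise ValueError("only [null, T] unions are handled here")
            nullable = "null" in avro_type
            avro_type = non_null[0]
        fields.append((field["name"], AVRO_TO_SPARK[avro_type], nullable))
    return fields


def struct_type_to_parquet(message_name, fields):
    """Step 2: render the StructType-like list as a Parquet message schema."""
    lines = ["message %s {" % message_name]
    for name, spark_type, nullable in fields:
        repetition = "optional" if nullable else "required"
        primitive, annotation = SPARK_TO_PARQUET[spark_type]
        suffix = " (%s)" % annotation if annotation else ""
        lines.append("  %s %s %s%s;" % (repetition, primitive, name, suffix))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example schema, not taken from the thread.
example_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": ["null", "string"]},
    ],
}

struct_fields = avro_to_struct_type(example_schema)
print(struct_type_to_parquet(example_schema["name"], struct_fields))
```

The point of the sketch is only that the Avro schema is never handed to the Parquet writer directly: anything the intermediate form cannot express is lost before the Parquet schema is built.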